2023-09-28 10:51:51,223 INFO [train.py:1107] (1/4) Training started 2023-09-28 10:51:51,223 INFO [train.py:1117] (1/4) Device: cuda:1 2023-09-28 10:51:51,229 INFO [train.py:1129] (1/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '09ada8fb-dirty', 'icefall-git-date': 'Thu Sep 28 10:47:39 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-6-0423201309-7c68fd68fb-6cszs', 'IP address': '10.177.28.83'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 30, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_tal_csasr': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000} 2023-09-28 10:51:51,230 INFO [train.py:1131] (1/4) About to create model 2023-09-28 10:51:52,089 INFO [train.py:1135] (1/4) Number of model parameters: 68625511 2023-09-28 10:51:58,407 INFO [train.py:1150] (1/4) Using DDP 2023-09-28 10:51:59,174 INFO [multi_dataset.py:39] (1/4) About to get multidataset train cuts 2023-09-28 10:51:59,174 INFO [multi_dataset.py:42] (1/4) Loading Aishell-2 in lazy mode 2023-09-28 10:51:59,249 INFO [multi_dataset.py:49] (1/4) Loading TAL-CSASR in lazy mode 2023-09-28 10:51:59,268 INFO [multi_dataset.py:142] (1/4) About to get train-clean-100 cuts 2023-09-28 10:51:59,294 INFO [multi_dataset.py:149] (1/4) About to get train-clean-360 cuts 2023-09-28 10:51:59,299 INFO [multi_dataset.py:156] (1/4) About to get train-other-500 cuts 2023-09-28 10:52:14,752 INFO [asr_datamodule.py:218] (1/4) Enable MUSAN 2023-09-28 10:52:14,753 INFO [asr_datamodule.py:219] (1/4) About to get Musan cuts 2023-09-28 10:52:17,990 INFO [asr_datamodule.py:243] (1/4) Enable SpecAugment 2023-09-28 10:52:17,990 INFO [asr_datamodule.py:244] (1/4) Time warp factor: 80 2023-09-28 10:52:17,991 INFO [asr_datamodule.py:254] (1/4) Num frame mask: 10 2023-09-28 10:52:17,991 INFO [asr_datamodule.py:267] (1/4) About to create train dataset 2023-09-28 10:52:17,991 INFO [asr_datamodule.py:294] (1/4) Using DynamicBucketingSampler. 2023-09-28 10:52:18,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:18,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:18,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:18,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:18,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:18,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:19,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:19,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:19,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:19,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:19,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:19,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:19,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:19,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:20,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:20,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:20,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:20,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:20,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:21,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:21,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:21,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:21,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:22,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:22,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:22,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:22,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:22,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:22,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:22,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:23,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:23,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:24,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:24,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:24,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:24,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:24,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:24,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:24,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:25,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:25,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:25,661 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:25,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:25,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:25,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:26,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:26,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:26,478 INFO [asr_datamodule.py:309] (1/4) About to create train dataloader 2023-09-28 10:52:26,479 INFO [multi_dataset.py:88] (1/4) About to get multidataset dev cuts 2023-09-28 10:52:26,479 INFO [multi_dataset.py:91] (1/4) Loading Aishell-2 DEV set in lazy mode 2023-09-28 10:52:26,481 INFO [multi_dataset.py:163] (1/4) About to get dev-clean cuts 2023-09-28 10:52:26,483 INFO [multi_dataset.py:170] (1/4) About to get dev-other cuts 2023-09-28 10:52:26,523 INFO [asr_datamodule.py:340] (1/4) About to create dev dataset 2023-09-28 10:52:27,341 INFO [asr_datamodule.py:357] (1/4) About to create dev dataloader 2023-09-28 10:52:27,341 INFO [train.py:1351] (1/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM. 2023-09-28 10:52:27,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:27,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:27,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:27,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:27,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:27,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:27,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:28,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:28,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:28,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:28,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:29,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:29,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:29,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:29,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:29,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:29,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:29,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:29,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:30,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:30,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:30,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:31,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:31,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:31,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:31,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:31,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:31,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:31,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:31,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:32,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:32,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:33,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:33,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:33,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:33,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:33,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:33,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:33,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:33,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:34,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:34,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:35,023 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:35,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:35,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:35,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:35,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:35,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:36,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:36,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:36,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:36,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:36,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:36,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:36,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:36,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:37,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:37,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:37,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:37,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:37,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:37,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:38,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:38,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:38,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:38,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:38,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:39,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:39,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:39,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:39,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:39,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:39,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:40,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:40,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:40,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:40,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:40,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:41,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:41,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:41,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:41,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:41,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:42,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:42,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:42,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:42,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:42,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:43,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:43,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:43,555 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:43,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:43,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:43,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:44,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:44,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:44,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:44,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:46,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 10:52:46,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:52:46,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:52:47,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:47,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:52:47,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 10:52:47,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 10:52:47,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:48,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:48,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:48,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:52:48,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:48,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 10:52:48,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:49,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:52:49,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:50,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 10:52:50,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:52:50,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:52:51,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:52,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:52,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:53,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 10:52:53,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 10:52:53,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:53,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:53,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:53,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:54,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 10:52:54,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:54,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:54,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:52:55,397 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 10:52:55,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:52:56,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:56,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:56,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 10:52:56,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:52:56,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:52:56,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:56,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:57,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:58,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 10:52:58,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:59,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:52:59,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 10:52:59,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 10:52:59,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:52:59,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:59,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:00,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:00,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:53:00,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:53:00,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:01,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:01,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:01,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:53:01,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 10:53:01,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:53:01,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:01,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 10:53:02,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:02,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 10:53:03,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:03,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:03,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:03,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:03,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:03,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 10:53:03,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 10:53:04,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:04,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:04,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:04,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:05,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 10:53:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 10:53:05,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 10:53:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:05,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:05,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 10:53:05,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 10:53:05,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:06,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:06,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:06,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:06,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:06,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:07,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:07,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 10:53:07,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:08,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:53:08,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:08,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:08,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:08,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:08,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 10:53:08,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:53:08,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:08,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:08,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:09,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 10:53:09,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:09,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:09,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:53:09,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:53:10,431 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 10:53:10,465 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 10:53:10,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:10,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:11,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:53:11,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:12,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:12,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:12,981 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 10:53:13,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:53:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:13,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:14,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:14,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:14,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:15,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:15,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:15,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:15,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:15,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:15,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:15,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 10:53:15,970 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 10:53:16,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:16,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:16,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:16,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:16,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 10:53:16,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:16,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:53:16,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:16,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:16,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:16,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:16,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:17,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:17,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:17,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:17,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:17,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:18,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:18,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:18,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:19,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 10:53:19,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 10:53:19,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 10:53:20,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:20,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:20,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:20,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:20,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:20,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:20,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:20,759 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 10:53:21,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:21,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:22,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:53:22,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 10:53:22,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:22,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:22,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:23,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:23,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:23,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:23,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:23,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 10:53:24,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:24,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:24,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:24,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:24,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:25,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 10:53:25,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:53:25,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:26,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:26,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:26,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 10:53:26,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:26,766 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 10:53:27,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:27,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:27,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:28,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 10:53:28,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:28,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:28,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 10:53:28,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:53:28,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:29,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:29,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:29,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:29,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:31,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:31,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:32,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:53:32,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:32,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:53:32,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:32,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:33,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:33,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:33,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:33,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 10:53:33,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:33,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:34,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:35,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:35,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:36,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:53:36,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:37,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 10:53:37,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:37,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:37,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:37,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:37,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 10:53:37,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:37,810 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 10:53:38,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:38,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:53:38,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:38,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:38,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:38,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:39,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:39,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:41,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:41,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:53:42,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:53:42,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:53:42,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:53:42,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:43,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:43,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:53:43,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:43,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:43,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 10:53:43,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:44,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:44,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:44,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:44,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:44,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:44,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:53:44,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:44,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:53:44,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:45,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:53:45,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:46,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:47,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:47,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:48,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 10:53:48,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:48,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:48,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 10:53:48,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:48,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:48,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 10:53:49,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:49,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:49,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:49,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 10:53:50,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:50,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:53:50,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 10:53:50,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:50,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:53:51,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:53:51,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 10:53:51,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 10:53:51,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:52,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:52,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:52,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 10:53:52,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:52,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:52,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:53,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:54,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:53:54,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 10:53:54,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:54,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:54,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 10:53:55,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:55,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:53:55,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:55,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 10:53:56,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:56,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:53:56,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:56,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:57,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 10:53:57,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:53:57,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:57,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 10:53:57,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:57,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:57,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:57,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:57,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:58,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:58,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 10:53:58,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:59,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:00,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:00,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:01,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 10:54:01,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:01,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 10:54:01,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:01,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 10:54:01,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:02,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 10:54:02,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:02,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:02,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:02,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:02,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:02,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:02,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:03,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:03,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:03,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:04,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:04,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:54:04,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:04,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:05,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 10:54:05,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:05,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:05,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:05,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:06,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 10:54:06,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:06,466 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 10:54:06,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 10:54:06,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:07,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:07,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 10:54:08,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:08,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:54:08,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:08,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:08,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:08,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:09,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:09,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:54:09,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 10:54:09,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:09,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:10,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:10,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:10,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:10,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:10,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 10:54:11,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 10:54:11,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:11,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 10:54:11,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:12,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:12,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:12,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 10:54:12,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:12,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:12,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:12,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:12,588 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 10:54:12,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 10:54:13,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:13,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:13,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 10:54:14,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 10:54:14,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:15,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:15,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 10:54:16,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:54:16,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 10:54:16,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:17,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:17,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 10:54:17,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:17,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:54:18,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:18,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:18,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 10:54:18,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:54:18,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 10:54:19,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:54:19,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:19,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 10:54:19,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:19,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:19,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 10:54:20,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:20,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:20,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:20,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 10:54:20,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:54:20,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:54:20,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:54:22,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:22,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:23,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 10:54:23,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 10:54:23,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:23,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:24,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:24,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:24,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:24,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 10:54:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 10:54:25,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 10:54:25,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:25,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:25,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:54:25,876 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 10:54:25,905 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 10:54:25,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:26,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:26,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:54:26,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:26,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:54:26,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 10:54:27,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:27,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:54:27,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:54:27,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 10:54:28,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:54:29,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 10:54:29,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 10:54:29,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:29,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:30,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:30,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:30,426 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 10:54:30,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:30,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:54:31,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:31,126 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 10:54:31,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 10:54:31,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:31,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:54:32,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 10:54:32,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:54:32,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:32,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:32,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:34,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:34,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:54:34,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:54:34,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:34,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 10:54:34,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:54:34,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:35,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:54:35,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:35,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:35,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 10:54:36,156 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 10:54:36,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:36,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:36,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:36,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:36,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:54:37,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 10:54:37,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:54:37,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:38,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:38,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:39,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:39,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 10:54:39,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:39,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:40,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 10:54:40,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:40,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:41,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 10:54:41,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 10:54:41,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:41,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 10:54:41,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:41,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:41,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:41,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:41,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:42,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:54:42,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:43,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 10:54:43,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:54:43,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:43,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:44,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:44,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:44,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 10:54:44,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 10:54:45,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:45,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:46,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:46,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:46,433 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 10:54:46,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:46,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:54:47,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:47,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:47,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:54:47,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:47,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 10:54:47,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 10:54:47,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:47,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:47,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:47,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:48,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:48,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:49,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:49,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:49,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 10:54:49,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:54:49,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:50,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:50,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:50,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:54:50,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:54:51,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 10:54:51,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 10:54:51,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:51,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:54:52,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:52,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:53,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:54:53,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 10:54:53,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:54:53,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:54,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:54,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 10:54:54,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:55,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 10:54:55,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:55,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:55,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:54:56,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:57,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:57,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:54:58,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:58,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:58,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:59,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 10:54:59,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:55:00,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:00,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 10:55:00,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:55:00,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 10:55:01,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:55:01,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:01,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:55:02,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:55:02,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:55:03,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:55:03,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:03,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 10:55:03,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:04,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:55:04,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:04,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:05,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 10:55:05,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:05,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:06,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:06,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:55:06,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:06,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:06,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:06,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:07,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:55:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:55:07,401 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 10:55:07,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:07,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:07,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:08,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:08,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:08,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:55:08,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 10:55:08,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:55:08,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:55:08,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:08,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:08,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:55:09,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 10:55:09,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 10:55:09,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:09,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:09,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:11,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:11,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:11,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:11,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:11,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:55:11,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:12,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:12,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:12,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:12,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:13,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 10:55:13,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 10:55:13,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 10:55:13,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:14,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:14,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 10:55:15,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:15,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:15,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:15,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:55:15,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:16,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:17,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:55:17,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:17,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 10:55:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 10:55:18,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:55:18,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:18,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:55:19,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:19,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 10:55:19,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:19,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:19,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 10:55:20,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:20,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:20,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:21,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:55:21,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 10:55:22,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 10:55:22,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 10:55:22,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:22,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:22,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:23,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:23,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 10:55:24,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 10:55:24,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 10:55:24,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 10:55:24,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 10:55:24,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 10:55:24,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:24,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 10:55:24,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:25,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:25,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:25,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:25,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:55:25,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:25,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:26,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:55:26,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:26,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:26,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:26,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 10:55:27,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:55:27,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:27,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:27,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:55:27,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 10:55:27,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:28,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 10:55:28,061 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 10:55:28,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 10:55:28,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:28,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:28,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:55:29,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:29,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:29,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:29,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:29,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:30,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 10:55:30,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:30,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 10:55:31,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:31,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:31,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 10:55:31,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:32,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:32,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:55:32,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:32,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:33,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 10:55:33,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:33,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:33,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:33,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:34,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:34,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:55:35,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:35,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:35,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:35,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:35,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:35,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:35,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:36,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:36,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:36,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 10:55:37,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:37,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:37,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:37,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:37,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 10:55:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:38,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 10:55:38,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:38,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:39,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:39,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:39,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:39,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:40,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:40,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:40,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 10:55:40,429 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 10:55:40,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 10:55:40,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:55:40,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:40,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:40,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:41,366 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 10:55:41,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 10:55:41,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:55:41,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:55:42,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:42,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:42,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 10:55:43,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:43,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 10:55:44,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:45,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:45,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 10:55:45,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:55:45,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:45,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 10:55:45,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:45,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:45,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:46,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:46,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:46,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 10:55:46,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 10:55:46,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 10:55:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:46,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:47,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:47,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:47,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:47,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:47,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:47,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 10:55:48,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 10:55:48,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:48,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 10:55:49,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 10:55:49,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 10:55:49,911 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 10:55:49,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:49,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:49,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:55:50,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:50,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:50,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 10:55:50,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:50,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:51,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:51,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:55:51,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:52,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:55:52,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 10:55:52,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:52,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:52,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:55:52,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:53,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:53,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:53,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:53,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:55:53,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:54,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:55:55,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:55:55,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:55,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 10:55:55,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:55,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:55,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 10:55:56,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:56,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:56,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 10:55:57,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:57,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 10:55:57,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:55:57,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:57,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:57,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:57,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:59,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:59,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:59,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:59,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:00,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 10:56:00,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:01,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:56:01,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:56:01,567 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 10:56:01,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 10:56:02,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:56:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:02,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:56:03,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:03,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:03,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 10:56:03,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:03,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 10:56:04,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:04,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:04,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:04,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:04,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 10:56:05,586 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 10:56:05,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:56:05,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 10:56:06,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:06,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 10:56:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:07,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:07,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:07,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:56:08,085 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 10:56:08,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:08,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:08,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:08,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:08,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 10:56:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:56:09,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:09,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 10:56:09,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:10,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:10,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:10,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:10,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 10:56:10,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:56:10,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:10,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:11,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:11,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:12,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 10:56:12,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:56:12,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:56:12,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:13,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:13,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:56:13,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 10:56:13,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:14,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:14,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:14,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 10:56:14,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:14,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:14,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 10:56:14,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:15,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:15,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:15,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 10:56:15,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 10:56:16,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:16,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 10:56:16,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:17,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:17,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 10:56:17,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 10:56:18,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:18,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:18,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:19,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 10:56:19,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 10:56:20,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 10:56:20,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:20,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 10:56:20,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 10:56:20,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 10:56:20,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:21,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:22,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:22,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:22,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:22,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:22,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 10:56:22,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:22,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:56:22,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:22,783 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 10:56:23,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 10:56:23,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 10:56:23,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 10:56:23,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:24,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:24,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:56:24,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:24,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:25,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 10:56:25,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:56:25,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 10:56:25,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 10:56:26,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:26,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:26,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:26,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:27,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:27,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:27,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:27,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:56:28,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:28,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:28,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:56:28,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:56:29,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:29,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:29,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:56:29,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:29,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 10:56:29,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:29,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 10:56:29,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:30,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 10:56:30,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:56:30,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:30,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:30,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:31,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 10:56:31,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 10:56:31,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:31,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 10:56:32,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 10:56:32,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:56:33,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:56:33,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:33,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:34,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:56:34,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 10:56:34,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 10:56:34,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 10:56:34,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:35,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:56:35,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 10:56:35,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:36,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:36,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:36,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:36,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:36,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:36,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 10:56:36,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:36,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 10:56:36,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 10:56:37,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:37,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:38,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:38,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:56:39,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:39,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:39,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 10:56:39,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:40,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:56:40,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:40,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:40,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 10:56:41,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:56:41,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:41,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:41,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:41,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:56:42,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:42,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 10:56:42,975 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 10:56:43,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:43,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:43,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:56:43,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:43,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 10:56:43,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:56:43,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:56:43,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:43,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:44,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 10:56:44,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:44,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 10:56:45,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:45,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:56:46,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 10:56:46,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:56:46,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:47,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:47,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:47,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 10:56:47,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:47,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:47,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 10:56:47,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:47,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 10:56:48,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:48,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:48,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:48,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:49,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:49,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:49,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:49,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 10:56:49,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:50,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 10:56:50,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:50,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:51,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 10:56:51,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:51,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:51,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:51,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 10:56:52,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:52,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:52,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 10:56:52,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:52,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:54,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:55,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:55,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 10:56:55,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:55,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:56,616 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 10:56:56,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:57,678 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 10:56:58,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:58,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:56:58,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:56:58,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:58,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:59,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:59,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:59,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:59,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:59,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:57:00,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:00,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:57:01,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:01,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:01,563 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 10:57:01,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 10:57:02,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:02,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:57:02,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:03,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:03,130 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 10:57:03,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:04,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:57:04,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:04,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 10:57:04,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:04,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 10:57:05,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 10:57:05,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:05,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:05,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:57:05,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:06,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:06,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:06,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 10:57:06,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:06,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:06,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:57:06,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:06,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:07,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:08,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:57:08,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 10:57:09,056 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 10:57:09,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:09,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:09,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:09,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:10,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 10:57:10,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:10,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:10,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 10:57:11,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:11,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:11,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:57:12,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:12,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:57:12,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:12,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:57:12,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 10:57:13,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:13,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:13,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:13,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:13,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:13,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:57:15,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 10:57:15,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:57:15,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:15,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 10:57:15,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:15,736 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 10:57:15,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:15,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:16,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:16,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:16,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:17,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 10:57:17,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 10:57:17,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 10:57:17,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:17,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 10:57:17,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:18,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:18,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:18,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 10:57:18,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:57:18,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:57:18,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:57:18,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:18,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 10:57:19,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:19,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:19,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:57:19,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:20,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:20,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 10:57:21,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:21,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:57:21,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:22,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:22,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:57:22,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 10:57:23,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:23,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:57:23,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 10:57:23,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:57:24,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:24,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:24,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:24,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:24,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:57:25,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:25,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 10:57:25,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:25,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:26,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 10:57:26,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:57:26,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:26,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:26,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 10:57:26,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:26,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 10:57:27,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:27,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:27,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:28,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 10:57:28,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 10:57:28,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 10:57:29,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:29,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 10:57:30,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:30,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 10:57:31,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:31,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:32,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:32,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:32,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:32,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:32,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:33,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 10:57:33,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:33,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:33,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 10:57:33,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:34,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:34,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 10:57:34,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 10:57:34,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 10:57:34,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:34,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 10:57:36,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:37,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:38,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:38,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 10:57:38,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:38,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 10:57:38,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:38,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:39,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:39,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 10:57:39,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:57:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 10:57:40,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 10:57:41,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 10:57:41,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:42,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:42,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:43,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 10:57:43,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 10:57:44,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:45,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:45,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:45,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:57:45,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:46,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:57:47,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:47,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:47,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 10:57:47,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:48,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:48,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:48,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:48,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:57:48,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:48,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:48,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 10:57:48,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:49,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:50,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:57:51,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 10:57:51,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:57:51,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:51,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:57:52,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:52,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:52,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:57:52,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:52,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:52,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:53,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:53,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:57:53,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:53,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 10:57:54,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:54,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 10:57:54,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:54,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:54,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 10:57:54,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:54,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:57:54,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 10:57:54,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:55,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:57:55,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:55,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:56,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:57:56,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:57,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:57,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:57,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:57,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:57,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:57,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:57,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 10:57:58,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:58,728 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 10:57:58,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:59,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:57:59,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:59,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 10:58:00,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:00,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 10:58:00,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 10:58:00,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:00,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:00,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:01,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 10:58:01,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 10:58:01,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 10:58:01,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:01,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:58:03,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 10:58:03,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:58:03,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:58:04,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:04,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:58:04,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 10:58:04,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:58:04,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:58:04,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:04,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:05,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:05,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:05,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:05,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 10:58:06,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:58:06,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:06,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:06,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 10:58:07,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 10:58:07,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:07,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 10:58:07,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:58:08,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:08,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:08,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:08,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 10:58:08,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:08,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:08,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 10:58:08,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:09,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:58:09,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 10:58:10,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:58:11,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:58:11,528 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 10:58:11,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:11,629 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 10:58:11,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:11,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:12,110 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 10:58:12,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:58:12,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 10:58:12,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:13,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:13,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:13,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:13,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:13,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:13,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 10:58:13,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 10:58:13,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:13,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 10:58:14,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 10:58:14,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:14,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:14,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:14,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:14,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:14,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:15,217 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 10:58:15,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:15,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:58:15,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:58:15,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:58:15,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 10:58:16,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:16,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 10:58:16,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 10:58:16,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 10:58:16,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:16,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:17,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:18,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 10:58:18,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 10:58:19,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:19,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:19,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:58:19,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:19,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 10:58:20,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:58:20,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:20,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:21,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:21,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:21,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 10:58:21,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:21,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:21,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:21,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 10:58:21,880 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 10:58:22,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:22,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 10:58:23,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:23,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:23,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 10:58:23,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:24,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:24,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:24,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:24,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:25,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:25,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 10:58:25,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 10:58:25,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 10:58:26,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:26,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 10:58:26,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:26,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:27,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:27,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 10:58:28,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:28,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 10:58:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:28,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 10:58:29,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 10:58:29,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:30,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 10:58:30,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:30,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:30,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:30,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 10:58:31,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 10:58:31,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:32,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:32,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:32,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:58:32,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:58:32,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:58:32,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:33,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:33,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:34,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 10:58:34,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:34,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 10:58:34,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:35,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:35,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:35,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 10:58:35,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 10:58:35,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 10:58:35,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 10:58:35,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:35,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:36,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:36,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:58:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:36,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 10:58:36,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:36,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:36,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:36,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:58:37,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 10:58:37,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 10:58:38,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:58:38,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:58:39,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 10:58:39,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:40,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 10:58:40,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:40,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:40,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:41,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:41,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:41,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:41,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:41,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:41,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:41,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:42,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:42,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:42,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:42,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 10:58:42,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:42,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 10:58:43,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 10:58:43,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 10:58:43,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:43,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:43,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:43,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:43,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 10:58:43,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:44,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:44,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:44,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 10:58:45,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:45,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:58:45,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 10:58:45,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:45,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:45,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:46,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:46,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:46,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 10:58:46,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:58:47,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:48,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:58:48,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:48,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:49,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:49,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 10:58:49,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:49,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:49,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:58:50,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:58:50,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 10:58:50,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 10:58:50,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:50,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 10:58:50,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:52,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:52,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:52,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:52,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:58:52,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 10:58:53,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:53,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:53,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 10:58:53,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:53,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:53,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:53,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:53,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:53,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:54,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:54,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:58:54,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:54,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:54,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 10:58:55,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:55,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:55,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 10:58:56,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:56,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:56,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:58:56,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 10:58:56,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:57,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:57,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:57,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 10:58:58,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:58,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 10:58:59,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:59,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:59,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:58:59,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 10:58:59,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:00,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 10:59:01,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:01,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:01,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:01,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:01,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:02,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:02,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:02,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:02,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:03,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 10:59:03,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:03,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 10:59:03,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:03,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:04,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:59:04,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:04,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:59:04,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:04,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:05,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:06,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:59:06,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:06,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 10:59:06,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:06,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:59:06,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:06,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:59:06,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:06,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:07,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:59:07,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:07,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:59:08,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:08,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 10:59:08,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:08,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:09,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:09,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:09,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:09,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:59:09,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 10:59:09,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:10,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:10,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 10:59:11,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 10:59:11,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 10:59:11,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:11,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:11,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:11,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:12,921 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 10:59:13,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:13,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:13,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 10:59:13,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 10:59:13,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:59:13,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:14,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:59:14,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 10:59:15,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:15,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 10:59:15,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:15,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:15,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:15,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 10:59:16,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:16,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 10:59:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:16,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:16,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:16,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:17,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:17,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:59:17,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:17,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:17,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:17,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:18,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:18,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 10:59:19,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 10:59:20,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 10:59:20,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:20,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 10:59:20,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:59:22,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:22,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 10:59:22,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:22,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:23,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 10:59:23,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:23,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:23,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:23,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:24,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:24,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:59:24,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:24,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:59:24,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:24,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:24,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:59:25,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 10:59:25,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:26,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:26,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:59:26,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 10:59:26,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 10:59:26,890 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 10:59:27,037 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 10:59:27,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:27,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:27,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:27,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:27,507 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 10:59:27,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:27,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:27,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:27,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:28,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:28,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 10:59:28,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:28,421 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 10:59:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:59:28,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:29,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:29,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:29,688 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 10:59:29,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 10:59:30,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:30,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:30,135 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 10:59:30,206 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 10:59:30,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 10:59:30,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:31,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 10:59:31,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 10:59:32,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 10:59:33,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 10:59:33,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:33,907 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 10:59:33,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 10:59:33,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 10:59:34,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 10:59:34,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:34,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 10:59:34,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:59:35,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:35,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 10:59:35,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:36,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 10:59:36,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:36,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:59:36,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:37,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:37,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:37,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:37,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 10:59:37,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:59:37,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:37,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:38,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:38,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:38,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:38,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:38,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:39,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:39,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:39,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:39,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 10:59:39,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:59:39,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:39,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:40,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:40,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:40,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:41,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:41,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:59:41,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:41,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:42,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:42,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:42,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:42,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:59:42,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 10:59:42,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:42,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:42,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:43,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:43,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:43,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:59:44,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:44,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:44,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 10:59:44,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:45,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:45,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:45,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:46,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:46,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:46,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:47,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:47,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:59:47,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:48,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 10:59:48,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:48,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:48,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 10:59:48,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:49,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:49,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:49,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:49,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:49,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:50,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 10:59:50,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:50,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:50,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 10:59:51,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:51,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:51,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:51,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 10:59:51,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:51,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:51,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:52,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 10:59:52,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:52,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 10:59:52,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:52,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:53,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:59:53,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:53,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:53,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:53,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 10:59:54,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 10:59:54,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:54,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:55,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:55,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:55,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:55,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:55,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:55,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:55,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:55,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:55,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:56,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:56,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 10:59:57,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:59:57,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:57,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:57,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:58,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:58,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:58,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:58,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:59:58,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:59:58,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:58,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:59,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:59,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:00,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:00,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:01,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:01,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:01,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 11:00:01,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:01,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:01,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:02,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:00:02,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:03,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 11:00:03,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:03,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 11:00:03,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:00:04,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:04,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:04,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:00:04,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:04,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:04,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:04,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:00:05,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:05,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:00:05,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:00:05,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:06,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:00:07,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:07,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 11:00:08,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:08,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:08,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:09,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 11:00:09,690 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 11:00:09,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:09,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:09,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:10,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 11:00:10,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 11:00:10,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:00:10,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:10,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:11,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:11,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:11,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 11:00:11,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:00:11,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 11:00:11,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 11:00:11,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:11,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:11,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 11:00:11,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:00:12,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 11:00:12,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:12,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:12,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:13,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:13,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 11:00:13,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:13,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:00:14,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 11:00:14,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:14,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 11:00:14,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 11:00:14,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 11:00:15,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:15,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:15,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:15,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:15,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:16,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:16,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 11:00:16,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:16,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:16,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:16,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 11:00:16,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 11:00:16,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 11:00:17,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:00:17,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:17,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 11:00:18,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:18,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:00:18,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:18,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:18,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:00:18,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:19,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:19,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:19,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:19,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:19,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 11:00:19,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 11:00:19,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:20,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:20,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:20,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:20,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:21,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:00:21,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:22,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:22,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:22,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:22,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:22,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:22,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:22,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:23,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:23,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 11:00:24,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:24,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:24,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:24,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:24,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:24,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:24,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:24,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:24,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 11:00:25,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:00:25,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:25,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:25,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:25,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:25,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:26,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:26,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:26,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 11:00:26,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:26,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:26,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:26,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:27,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:00:27,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:27,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:27,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 11:00:27,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 11:00:28,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:00:28,062 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 11:00:28,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:29,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:29,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 11:00:29,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:29,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 11:00:29,556 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 11:00:29,557 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 11:00:29,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 11:00:30,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:30,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:30,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:30,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:30,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:00:31,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:31,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:32,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:32,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 11:00:32,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:33,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:34,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:34,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:34,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:34,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:34,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:34,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 11:00:35,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 11:00:35,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:00:36,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 11:00:36,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:37,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:37,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:38,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:38,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 11:00:38,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:39,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:39,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:00:39,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:39,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:40,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:40,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:40,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 11:00:40,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:40,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 11:00:41,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:41,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:00:41,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:41,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:41,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:41,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:41,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:41,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:00:41,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:42,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:00:42,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:00:42,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:42,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:43,098 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 11:00:43,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:00:43,448 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 11:00:43,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:00:43,659 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 11:00:43,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:43,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:00:44,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:44,944 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 11:00:45,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:45,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:46,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:46,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:00:46,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:46,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:46,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:47,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 11:00:47,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:47,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:47,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 11:00:47,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:47,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:47,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:48,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:48,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:00:48,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:00:49,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 11:00:49,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:49,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:49,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:50,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:50,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:50,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:50,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:50,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:52,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:52,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:00:52,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:00:52,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:53,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:53,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:54,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:00:54,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 11:00:54,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:54,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:54,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 11:00:54,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:54,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:55,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:55,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:55,980 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 11:00:56,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:57,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:57,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:00:57,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:57,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:57,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 11:00:57,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:57,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:57,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:58,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:00:59,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:59,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:00,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:00,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:00,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:01:00,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:01,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:01,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:01:01,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:01:01,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 11:01:02,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:01:02,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:02,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:02,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:02,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:02,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:01:02,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:01:02,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 11:01:02,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:01:02,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:02,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 11:01:03,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:03,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:04,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:04,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:04,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:01:04,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:01:04,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:05,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:01:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 11:01:06,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:06,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 11:01:07,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 11:01:07,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:07,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:07,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:08,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:08,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:08,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 11:01:08,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:09,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 11:01:09,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:09,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:09,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:10,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:01:10,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 11:01:10,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:01:10,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:10,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:10,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:11,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:01:11,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 11:01:11,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:12,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:12,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:12,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 11:01:13,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:13,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 11:01:13,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:01:13,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 11:01:14,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 11:01:14,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:14,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:01:14,594 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 11:01:14,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 11:01:14,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 11:01:15,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:15,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:16,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:16,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:16,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 11:01:16,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 11:01:17,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:01:17,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:18,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 11:01:18,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:18,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:18,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 11:01:19,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:19,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 11:01:20,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:20,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 11:01:21,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:21,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:22,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:22,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 11:01:22,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:23,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:23,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:23,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:23,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:01:23,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:01:23,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:23,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:24,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:24,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:01:24,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:24,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:01:24,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 11:01:24,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 11:01:25,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:25,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:25,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 11:01:25,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 11:01:25,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 11:01:25,424 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 11:01:26,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 11:01:26,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:26,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:26,690 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 11:01:26,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:26,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:01:27,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:01:27,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:27,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:27,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:28,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 11:01:28,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:28,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:29,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:29,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:29,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:29,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 11:01:29,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:30,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:01:30,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:30,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:01:30,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:30,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:30,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:31,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 11:01:31,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:01:32,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:33,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:33,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:33,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:01:33,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:33,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:33,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 11:01:33,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:34,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:34,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:34,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:35,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:35,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 11:01:35,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:35,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:35,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 11:01:35,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:35,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:01:36,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:01:36,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:36,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:01:37,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 11:01:37,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:38,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:01:39,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:40,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:40,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 11:01:40,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:01:41,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:41,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:01:41,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:01:41,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 11:01:41,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:41,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:41,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 11:01:41,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:41,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 11:01:41,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:42,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:42,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:42,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:01:42,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 11:01:43,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:43,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:43,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:44,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:44,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:45,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:01:45,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 11:01:45,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:45,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:45,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:45,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:01:46,048 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 11:01:46,049 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 11:01:46,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 11:01:47,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:47,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 11:01:47,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 11:01:47,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:47,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 11:01:48,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 11:01:48,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:48,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:48,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:48,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:49,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 11:01:49,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:49,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 11:01:49,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:01:50,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:50,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:50,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:01:50,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:50,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:50,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:50,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:01:51,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 11:01:51,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:51,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:51,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 11:01:52,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:53,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:54,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:54,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:54,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:01:54,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:55,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:55,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:01:55,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:55,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:01:55,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:56,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:56,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:56,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:56,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 11:01:56,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:56,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:56,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:01:57,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:01:57,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:58,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:58,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:58,715 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 11:01:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 11:01:59,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:01:59,183 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 11:01:59,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 11:01:59,356 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 11:01:59,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:59,751 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 11:01:59,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 11:02:00,115 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 11:02:00,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:01,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 11:02:01,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 11:02:01,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:02:01,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 11:02:01,993 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 11:02:02,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 11:02:03,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:03,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:03,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:03,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 11:02:03,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:02:04,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 11:02:04,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:04,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:04,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 11:02:05,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:05,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:05,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 11:02:05,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:05,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:05,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:06,391 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 11:02:06,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:06,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:02:07,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:07,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:02:07,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 11:02:08,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:08,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:08,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:09,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 11:02:09,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:09,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:10,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 11:02:10,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:10,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:02:10,337 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 11:02:10,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:10,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:10,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:02:11,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:11,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:11,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 11:02:11,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:11,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:11,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 11:02:12,165 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 11:02:12,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:12,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 11:02:12,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:12,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 11:02:13,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:13,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:02:13,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:13,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:14,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 11:02:14,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 11:02:15,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:15,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 11:02:15,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:15,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:15,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:15,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:15,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:16,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:16,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:16,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:16,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:16,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:16,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:17,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:17,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:17,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:17,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:02:17,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:18,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:02:18,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:18,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 11:02:18,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:18,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:18,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:19,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:19,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:19,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:19,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:19,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 11:02:19,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:20,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:02:20,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:20,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:20,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:20,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:20,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:20,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:02:20,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:02:20,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 11:02:20,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:21,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:22,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:22,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:22,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:02:22,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 11:02:22,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:23,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:02:23,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:24,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:24,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:24,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:24,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:24,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:24,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:25,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:25,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:25,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:25,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:02:26,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:26,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:27,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:02:27,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:27,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:27,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:28,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:28,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:28,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:29,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:29,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:29,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:29,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:29,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 11:02:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:30,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:30,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 11:02:30,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 11:02:30,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:31,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:31,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:31,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:31,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:31,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:31,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:31,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:02:32,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:32,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:32,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 11:02:32,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:32,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:32,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 11:02:33,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:33,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:33,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:33,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:02:33,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:33,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:33,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:33,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:34,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:34,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:02:34,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:34,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:34,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:36,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:36,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:36,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:37,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:37,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:37,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:02:37,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:38,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:38,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 11:02:38,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:38,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 11:02:39,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:02:39,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:39,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 11:02:39,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:40,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:40,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 11:02:40,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:40,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:02:40,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:40,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:40,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 11:02:40,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:41,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:41,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:41,348 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 11:02:41,349 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 11:02:42,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:42,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:42,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:43,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:43,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 11:02:43,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:02:43,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 11:02:43,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:44,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:44,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:44,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:44,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:44,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:44,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:45,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:02:45,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:46,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:46,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:46,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:47,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:47,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 11:02:47,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:47,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:47,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:47,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:48,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:48,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:49,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:49,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:49,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:02:49,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:02:49,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:49,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:49,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 11:02:49,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:50,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:50,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:50,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 11:02:50,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:50,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:50,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:50,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 11:02:51,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:51,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:02:51,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:51,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:52,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:52,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:52,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:52,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:53,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:53,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 11:02:53,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 11:02:53,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:54,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 11:02:54,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:54,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 11:02:54,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 11:02:54,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:57,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:57,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:57,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:57,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:57,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:57,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:02:57,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 11:02:58,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:58,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:58,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:58,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:58,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:58,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:58,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:59,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:59,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:59,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:59,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:59,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:03:00,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:00,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 11:03:00,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 11:03:00,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:03:01,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:01,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 11:03:01,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:03:01,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:01,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:01,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:03:01,367 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 11:03:01,432 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 11:03:01,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:03:01,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:02,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:03:02,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:02,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:03,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 11:03:03,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:03,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 11:03:03,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 11:03:03,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:03:03,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:03:04,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:04,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:04,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:03:04,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:05,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:03:05,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 11:03:05,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:03:05,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:05,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 11:03:06,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 11:03:06,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:06,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 11:03:06,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:06,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:06,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:03:07,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:07,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:07,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:08,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:08,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 11:03:08,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 11:03:08,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:08,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:09,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 11:03:10,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:03:10,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:11,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:03:11,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:03:11,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 11:03:12,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:12,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 11:03:12,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:12,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:13,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:13,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 11:03:13,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:13,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:13,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:14,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:14,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 11:03:14,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 11:03:14,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:03:14,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:15,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:03:15,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:15,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:15,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:16,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:16,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:17,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:17,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:17,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:17,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 11:03:18,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 11:03:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 11:03:18,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:18,487 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 11:03:18,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 11:03:18,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:18,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:03:18,799 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 11:03:18,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:03:19,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 11:03:19,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:19,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:03:19,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:19,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:03:19,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:20,013 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 11:03:20,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:20,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 11:03:20,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:20,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:21,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 11:03:21,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:21,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 11:03:21,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:21,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:21,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:22,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:22,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:03:22,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:22,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:22,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:22,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:03:22,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:22,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:23,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:23,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 11:03:23,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:24,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:24,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:03:24,666 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 11:03:24,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 11:03:25,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:25,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:25,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 11:03:25,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:26,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:03:27,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:28,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 11:03:28,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:03:28,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:28,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:28,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:28,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:28,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 11:03:29,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 11:03:29,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:29,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:03:29,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:29,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:29,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:29,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:30,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:03:30,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:30,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:03:31,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:31,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 11:03:31,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:03:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:31,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:03:32,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:32,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:32,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:03:32,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 11:03:32,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:32,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 11:03:32,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:33,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 11:03:33,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:33,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:03:33,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 11:03:33,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 11:03:33,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:03:33,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:34,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:34,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:03:34,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:34,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 11:03:34,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:34,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:35,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:35,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:35,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 11:03:36,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 11:03:36,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 11:03:36,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:37,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:03:38,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:38,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:38,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:38,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:38,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:38,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:39,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:39,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:39,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:39,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:39,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:39,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 11:03:39,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:39,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:40,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:40,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:03:40,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:40,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:41,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:41,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:42,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:42,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:42,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:42,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:42,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:03:42,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:42,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 11:03:42,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:42,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:43,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 11:03:43,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:44,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:44,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:45,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:03:45,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 11:03:45,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 11:03:45,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 11:03:46,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:46,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:46,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:46,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:03:47,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:48,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 11:03:48,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:48,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:48,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:48,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:48,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:03:49,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:49,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 11:03:49,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:49,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:49,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 11:03:49,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:50,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:03:50,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 11:03:50,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 11:03:50,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:51,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:51,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:51,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:51,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:51,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:51,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:52,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:52,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:52,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:52,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:03:52,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:52,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 11:03:53,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:53,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 11:03:53,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:53,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 11:03:54,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 11:03:55,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:55,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:55,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:55,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:03:55,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 11:03:55,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:55,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:03:56,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 11:03:56,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:56,599 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 11:03:56,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 11:03:57,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:57,188 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 11:03:57,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:03:57,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 11:03:57,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 11:03:57,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 11:03:58,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:58,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:58,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:58,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 11:03:58,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:58,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:59,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:59,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:03:59,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 11:03:59,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:00,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:04:00,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:00,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 11:04:00,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 11:04:00,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:00,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:04:00,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:04:01,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:01,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:04:01,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:04:01,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:04:01,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 11:04:01,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:04:01,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:01,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:01,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:01,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 11:04:01,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:02,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 11:04:02,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:02,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 11:04:02,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 11:04:02,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:02,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:02,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 11:04:03,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:04:03,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:03,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:03,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:03,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:05,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:04:05,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:05,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 11:04:06,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:06,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:04:06,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:06,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:06,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 11:04:07,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:07,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:04:08,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:09,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:10,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 11:04:10,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:10,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 11:04:11,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:04:12,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:12,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:12,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:12,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 11:04:13,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:04:13,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 11:04:13,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 11:04:14,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:04:14,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:14,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:04:14,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:14,981 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 11:04:14,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:04:15,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:15,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 11:04:15,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 11:04:15,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 11:04:16,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 11:04:16,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:16,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:16,628 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 11:04:16,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:16,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:16,877 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 11:04:17,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:04:17,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:19,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:19,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 11:04:19,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:19,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:19,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:19,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:19,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:04:20,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:20,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:04:20,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:20,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:20,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:20,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:20,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:20,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:21,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:21,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:21,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:21,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:21,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:22,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 11:04:22,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:22,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:04:22,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:22,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:04:23,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:24,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:24,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:24,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 11:04:24,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:24,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:04:24,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:24,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 11:04:24,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 11:04:24,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:25,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:25,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:25,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:04:26,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:26,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:26,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:26,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 11:04:26,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:27,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:04:27,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 11:04:27,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:27,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 11:04:27,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 11:04:27,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 11:04:27,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:28,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:28,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:28,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:28,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:04:29,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:04:29,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:29,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:30,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 11:04:30,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:30,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:30,551 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 11:04:30,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:30,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:30,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:04:30,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:31,003 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 11:04:31,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:31,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:04:31,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:31,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 11:04:31,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:04:31,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:32,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:04:32,416 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 11:04:33,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 11:04:33,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:33,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 11:04:33,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:34,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:04:34,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:34,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:34,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:34,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:34,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:35,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:04:35,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:35,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:35,463 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 11:04:35,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 11:04:35,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:04:35,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:35,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:36,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:36,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:36,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:36,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:36,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:36,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:36,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:04:37,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 11:04:37,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:37,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:37,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:37,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:37,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:38,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:38,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:38,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:38,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:38,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:39,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:39,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:40,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:40,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:40,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 11:04:40,625 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 11:04:40,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:41,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 11:04:41,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 11:04:41,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:04:41,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:41,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:41,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 11:04:41,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:41,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:41,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:42,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:42,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:42,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:42,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:43,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:43,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:43,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:43,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:44,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:44,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:44,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:44,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 11:04:44,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:04:44,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 11:04:44,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:44,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 11:04:45,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:45,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:46,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:46,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 11:04:46,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:47,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:47,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:48,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:48,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 11:04:48,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:48,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:04:48,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:48,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 11:04:48,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:48,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 11:04:49,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:49,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:49,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:04:49,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:49,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 11:04:50,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 11:04:50,064 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 11:04:50,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:50,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:50,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:50,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:50,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:51,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:51,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 11:04:52,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:52,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:52,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:52,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:04:54,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:54,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 11:04:55,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:55,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:55,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 11:04:55,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:55,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:55,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:55,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:56,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:56,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:56,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:57,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:57,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 11:04:58,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:58,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 11:04:59,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 11:04:59,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:59,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:04:59,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 11:04:59,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:00,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:05:01,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:05:01,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:01,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:02,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:02,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 11:05:03,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 11:05:03,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:05:03,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:05:03,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:04,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 11:05:04,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:05:05,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:05,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:05,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:05:05,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:05,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 11:05:05,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:05,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:06,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:06,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 11:05:07,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:05:08,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:08,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:08,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:09,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:09,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:09,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:09,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:10,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:10,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:05:10,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 11:05:10,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:05:11,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:05:11,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:11,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 11:05:12,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:12,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:12,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:12,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:12,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:05:12,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:12,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:12,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 11:05:13,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:13,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:05:13,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:13,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:14,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 11:05:14,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:15,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:15,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:05:15,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:15,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:15,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:15,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 11:05:16,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 11:05:16,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 11:05:16,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:16,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:16,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:16,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:05:17,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:05:17,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:05:17,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:18,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 11:05:18,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 11:05:18,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:18,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:18,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:18,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:19,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 11:05:19,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:19,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:19,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 11:05:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 11:05:20,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:20,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:20,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:20,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:20,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:05:22,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:22,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:05:23,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:23,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:05:23,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:23,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:23,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:05:24,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:24,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:24,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:24,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:05:24,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:05:25,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:05:25,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:25,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:25,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:25,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:25,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 11:05:25,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:25,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:25,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:05:26,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:26,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:26,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:27,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 11:05:27,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:27,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 11:05:27,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:27,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:05:27,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:28,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 11:05:28,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:29,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:29,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 11:05:29,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:30,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:30,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 11:05:31,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 11:05:31,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:31,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:05:31,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:31,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:31,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:32,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:32,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:32,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:32,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:05:32,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:32,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 11:05:33,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:33,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:33,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:34,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:34,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:05:34,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:34,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 11:05:34,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:35,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:35,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:05:36,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:36,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:36,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:36,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 11:05:37,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:37,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 11:05:38,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:38,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:39,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:39,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:05:40,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:05:40,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 11:05:40,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 11:05:40,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 11:05:40,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:40,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:40,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 11:05:41,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:41,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:41,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:41,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 11:05:41,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 11:05:41,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:41,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 11:05:43,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 11:05:43,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:43,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 11:05:44,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 11:05:44,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:44,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:05:44,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:05:45,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:05:45,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:45,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 11:05:45,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:05:45,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:45,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 11:05:45,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:45,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:45,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:45,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:46,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 11:05:46,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 11:05:46,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:46,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 11:05:47,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:47,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:47,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:05:48,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:48,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:05:48,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:48,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:05:48,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:49,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:49,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:49,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:05:50,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:50,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:50,945 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 11:05:51,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:51,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:51,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:05:51,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:51,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:05:51,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:52,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 11:05:52,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:52,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:05:52,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:52,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:05:53,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:53,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 11:05:53,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:53,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:05:53,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:53,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:54,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:54,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:54,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:54,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:54,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:54,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:55,061 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 11:05:56,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:56,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:05:56,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:05:56,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 11:05:56,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:57,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:57,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 11:05:57,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:57,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:58,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:58,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:05:58,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:05:58,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:58,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 11:05:59,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:59,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 11:05:59,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:59,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:00,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:00,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 11:06:00,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:00,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:06:00,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:06:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:00,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:06:01,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 11:06:01,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 11:06:01,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:01,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:01,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:06:02,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:02,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:03,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:03,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 11:06:03,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:06:03,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:06:03,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 11:06:04,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:04,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:04,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:04,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:06:04,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:05,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:06:05,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:06:05,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:06,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:06,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 11:06:06,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:06,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:06,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:06,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 11:06:07,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 11:06:07,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:07,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:07,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:08,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:08,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:06:09,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 11:06:10,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:10,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:06:10,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:11,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:06:11,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:06:11,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:06:12,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:12,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:12,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:06:13,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:14,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:14,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:06:14,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 11:06:14,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:14,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:06:15,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:06:15,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:15,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:06:15,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:15,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:15,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:15,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 11:06:16,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:06:16,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:17,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:17,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:06:17,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:17,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:06:17,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:18,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:18,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:18,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:06:18,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 11:06:19,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:20,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:20,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 11:06:21,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 11:06:21,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:21,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:21,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:21,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 11:06:22,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 11:06:22,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 11:06:22,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:22,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:23,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:23,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:06:24,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:06:24,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 11:06:24,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:25,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:25,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:06:25,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:26,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:06:26,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 11:06:27,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:27,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:27,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:27,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:06:27,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:28,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:28,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:28,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:06:28,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:28,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:28,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:28,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:28,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 11:06:28,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 11:06:29,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:29,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:29,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:29,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:29,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 11:06:30,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 11:06:30,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:31,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 11:06:31,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:06:31,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:32,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:32,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:32,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 11:06:32,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 11:06:33,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:33,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:33,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:06:33,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:06:33,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:33,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:33,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:33,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 11:06:34,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:34,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 11:06:34,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:34,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:34,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:06:34,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:34,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:34,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:34,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:35,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:35,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 11:06:35,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:35,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:35,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:35,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:35,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:36,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:36,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:36,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:37,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 11:06:37,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:37,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 11:06:37,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:37,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 11:06:38,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 11:06:38,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:38,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:38,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:38,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:39,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:39,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:40,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:06:40,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:40,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:41,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:41,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:41,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:06:42,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:43,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:43,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:44,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 11:06:44,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 11:06:44,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:44,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 11:06:44,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:45,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 11:06:45,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:46,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 11:06:46,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:46,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:46,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:47,435 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 11:06:47,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:47,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 11:06:47,687 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 11:06:47,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:48,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:48,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:48,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:48,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 11:06:48,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:48,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:48,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:49,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:06:49,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:50,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:50,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:06:51,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 11:06:52,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 11:06:52,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 11:06:52,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:52,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:53,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:06:53,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:53,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:53,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:53,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 11:06:54,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:54,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:55,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 11:06:56,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:57,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:58,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:58,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:58,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:58,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 11:06:58,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:06:58,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 11:06:58,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:06:58,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 11:06:59,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:59,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:59,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:59,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:59,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:59,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:06:59,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:07:00,082 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 11:07:00,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:00,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:00,644 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 11:07:00,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:07:00,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:01,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 11:07:01,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:02,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:02,141 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 11:07:02,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:02,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 11:07:02,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:02,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:02,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:07:02,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:03,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:07:03,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:03,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 11:07:03,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:03,569 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 11:07:04,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:07:04,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:07:05,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:07:05,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:05,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:05,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:06,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:06,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:07:06,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 11:07:07,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:07,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:07,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:07:07,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:07:07,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:07:08,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:07:08,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:08,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:07:08,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:08,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:09,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:09,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:07:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:07:10,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 11:07:10,293 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 11:07:10,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:11,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 11:07:11,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:12,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:13,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:13,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:13,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:13,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:13,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 11:07:14,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:14,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:14,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 11:07:14,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:15,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 11:07:15,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:15,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:16,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 11:07:16,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 11:07:16,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:16,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:16,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:16,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:17,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 11:07:17,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 11:07:18,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 11:07:18,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 11:07:18,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:18,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:18,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:18,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:18,710 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 11:07:19,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:19,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:19,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:20,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:07:20,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:07:20,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:20,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:20,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 11:07:20,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:20,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:20,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:20,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:21,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 11:07:21,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:21,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 11:07:21,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:22,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:22,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 11:07:22,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:22,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:07:22,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:07:22,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 11:07:22,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:07:22,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:07:23,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 11:07:23,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:23,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:23,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:24,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:24,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:25,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:26,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:26,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:26,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:27,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:27,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:27,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:28,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:28,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:28,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 11:07:28,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:28,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 11:07:28,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 11:07:28,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 11:07:28,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:29,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:29,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:30,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:30,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:30,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:07:30,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:07:30,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:30,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:07:31,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:31,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:31,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 11:07:32,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 11:07:32,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:32,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 11:07:32,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:33,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:33,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:33,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:34,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 11:07:34,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:34,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:34,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 11:07:34,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:35,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 11:07:35,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:35,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:35,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:35,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 11:07:35,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:35,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:07:36,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:07:36,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 11:07:36,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:36,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:07:36,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:07:36,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 11:07:36,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:36,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:07:36,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:37,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:37,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 11:07:37,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:37,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:38,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 11:07:38,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:38,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:38,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:07:38,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:38,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:39,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 11:07:40,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 11:07:40,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:40,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:40,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:41,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:07:41,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:41,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:41,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 11:07:41,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:41,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:41,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:41,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:42,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:07:42,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 11:07:42,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:43,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:07:43,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:43,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:43,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:43,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:43,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 11:07:44,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:44,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:44,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:07:45,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:45,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:46,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 11:07:47,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:47,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:47,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:47,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 11:07:48,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:07:49,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:49,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:07:49,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:50,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:07:50,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 11:07:50,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:50,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:51,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:51,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:51,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:51,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:07:51,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:52,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:52,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:52,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:52,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:52,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:53,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 11:07:54,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 11:07:54,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:54,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:54,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:54,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:07:54,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:54,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:54,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:07:55,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:55,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:55,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:55,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 11:07:55,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:07:56,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 11:07:56,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:56,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:07:56,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:56,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:57,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 11:07:57,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:07:57,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:07:57,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:58,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:58,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:58,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:58,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:58,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:59,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:59,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 11:07:59,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:00,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:00,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:01,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:02,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:02,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 11:08:02,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:08:02,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:02,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:08:02,669 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 11:08:03,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:08:03,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:08:03,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 11:08:03,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:08:03,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 11:08:04,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:08:04,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:08:04,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:08:04,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:04,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:04,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:08:05,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:05,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 11:08:05,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 11:08:05,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:05,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:05,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:08:05,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:06,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:08:06,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 11:08:06,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 11:08:06,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 11:08:06,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:06,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 11:08:06,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 11:08:07,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:07,665 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 11:08:07,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:08,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:08,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:08,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 11:08:08,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:08:08,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:08,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:08,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:08,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:09,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:09,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:09,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:09,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:10,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 11:08:10,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:08:10,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:11,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:11,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:08:11,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:12,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:08:12,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:12,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:08:12,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:13,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:13,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:14,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:15,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 11:08:15,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:15,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:16,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:16,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 11:08:16,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:16,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:17,540 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 11:08:17,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:17,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:08:17,962 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 11:08:18,056 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 11:08:18,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:18,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:18,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:18,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:18,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:18,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 11:08:18,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:18,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:18,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:18,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 11:08:19,153 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 11:08:19,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 11:08:19,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 11:08:19,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:19,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:19,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:19,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:20,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 11:08:20,331 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 11:08:20,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:20,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:20,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:21,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:22,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 11:08:22,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 11:08:22,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 11:08:22,234 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 11:08:22,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:08:22,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:22,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 11:08:22,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:23,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:23,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 11:08:23,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:23,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 11:08:23,594 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 11:08:23,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 11:08:24,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 11:08:24,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 11:08:24,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:24,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:24,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:24,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:24,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 11:08:24,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 11:08:24,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:25,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:08:25,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:25,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:25,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 11:08:25,566 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 11:08:26,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:26,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:27,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 11:08:27,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:28,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:28,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:08:28,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 11:08:28,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:08:28,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:29,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:08:29,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:29,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 11:08:29,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 11:08:30,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 11:08:30,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:30,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 11:08:30,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:08:30,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:30,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 11:08:31,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:31,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:31,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:08:32,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:32,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:33,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:33,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:33,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:33,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:33,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 11:08:33,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:34,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:34,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:34,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:34,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:08:35,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:35,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:36,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:36,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:36,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:36,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:08:36,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 11:08:37,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 11:08:37,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:08:37,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:37,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 11:08:37,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:38,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:38,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 11:08:38,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:38,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:38,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:38,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:38,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:39,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:08:39,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 11:08:39,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:08:39,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:39,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:40,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:40,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:08:40,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:08:40,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 11:08:41,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:42,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:42,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 11:08:42,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 11:08:42,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:43,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:43,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:43,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:08:43,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:43,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:43,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:45,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:45,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:45,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:45,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:08:45,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:46,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:47,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:08:47,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:48,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:48,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 11:08:48,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:48,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:49,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:49,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:49,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 11:08:49,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:49,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:50,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:08:50,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:50,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:50,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:50,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:51,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 11:08:51,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 11:08:51,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 11:08:51,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 11:08:52,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 11:08:52,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:52,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:52,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:53,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:53,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:54,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:54,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:54,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:54,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:54,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:54,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:55,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:55,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 11:08:56,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 11:08:56,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:08:56,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 11:08:56,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 11:08:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:57,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 11:08:57,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:58,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:58,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:58,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:08:58,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 11:08:58,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:58,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:58,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:59,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:59,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 11:08:59,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 11:08:59,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:59,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 11:08:59,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 11:08:59,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:00,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:00,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:00,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:00,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:00,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:09:00,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 11:09:00,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:00,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:09:01,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 11:09:01,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:09:01,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 11:09:01,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:09:01,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:01,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:01,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:01,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:09:02,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:02,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:09:03,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:03,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:03,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:03,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:09:03,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:04,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 11:09:04,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:09:04,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:04,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:05,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:06,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 11:09:06,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:06,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:07,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:07,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:07,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 11:09:07,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:09:07,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:08,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:08,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:09:08,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:09:09,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 11:09:09,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:09:10,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:10,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:10,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:10,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:09:11,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:11,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 11:09:11,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:11,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:11,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:11,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:11,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:11,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 11:09:11,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 11:09:11,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 11:09:11,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:12,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:12,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:12,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:13,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:14,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:14,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:14,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:14,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:14,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:14,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:14,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 11:09:15,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:15,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 11:09:15,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:16,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 11:09:16,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:16,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 11:09:16,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 11:09:16,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:16,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:09:17,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:09:17,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:17,306 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 11:09:17,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:17,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 11:09:18,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:18,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:18,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 11:09:18,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:19,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:09:19,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:20,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:20,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:20,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:09:20,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 11:09:21,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 11:09:21,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:09:21,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 11:09:21,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:22,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:22,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:22,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 11:09:22,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:22,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:22,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:22,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:09:23,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:09:23,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:23,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:24,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:24,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:24,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:09:24,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:09:24,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:09:24,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 11:09:25,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:26,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:26,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:26,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:26,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:09:27,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 11:09:27,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 11:09:27,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:27,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:27,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:28,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:28,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:29,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:09:29,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:30,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 11:09:30,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:09:31,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:31,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 11:09:31,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:32,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:32,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 11:09:32,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:32,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:33,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:33,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:33,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 11:09:33,405 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 11:09:33,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:33,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:33,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:33,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 11:09:34,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:34,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 11:09:34,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:35,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:35,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:35,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:35,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 11:09:36,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:36,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 11:09:36,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:36,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:36,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:38,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 11:09:38,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:09:38,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 11:09:39,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:39,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:39,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:39,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:39,708 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 11:09:39,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 11:09:40,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 11:09:40,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:41,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:41,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:41,347 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 11:09:41,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:41,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:42,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:42,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 11:09:42,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 11:09:42,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:42,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:42,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:42,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:09:42,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 11:09:43,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 11:09:43,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:43,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:43,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 11:09:43,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:43,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:44,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:09:44,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:44,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:45,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:45,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 11:09:45,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 11:09:45,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 11:09:46,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:09:46,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:46,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 11:09:46,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:47,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:47,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:09:47,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:47,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:47,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 11:09:48,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:48,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:09:48,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:49,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:49,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:49,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:49,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:09:49,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:49,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:49,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:50,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:50,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:50,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:50,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:09:50,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:51,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 11:09:51,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 11:09:51,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:51,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:51,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:51,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:52,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:52,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:52,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:53,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:53,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:53,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 11:09:54,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:54,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:54,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:54,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:54,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:54,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:09:54,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:54,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:55,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:55,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:09:55,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:55,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:55,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:55,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 11:09:56,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 11:09:56,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:56,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:56,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:56,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:56,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:57,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 11:09:57,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:59,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:59,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:09:59,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:59,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:59,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:10:00,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:00,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 11:10:00,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:00,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:00,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:10:00,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:10:00,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 11:10:01,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:10:01,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:01,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:01,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 11:10:02,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 11:10:02,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:03,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:03,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 11:10:03,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:03,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:10:03,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:10:03,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 11:10:03,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 11:10:03,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:04,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 11:10:05,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:06,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:06,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:07,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 11:10:07,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:08,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:08,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:08,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:08,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 11:10:09,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 11:10:09,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 11:10:09,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 11:10:10,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:10:10,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:10,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:10,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:10,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:10:10,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 11:10:10,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 11:10:10,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:10,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:10:11,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:10:11,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:11,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:11,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:11,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 11:10:11,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:12,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:12,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 11:10:12,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 11:10:13,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 11:10:13,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:10:13,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:13,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:13,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:13,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:10:13,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:13,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 11:10:14,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:14,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:10:14,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:15,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 11:10:15,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:15,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:10:15,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 11:10:16,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:16,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:16,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 11:10:16,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:10:16,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:16,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:17,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:17,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:17,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:17,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:10:17,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:10:18,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:18,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:10:18,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 11:10:18,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 11:10:18,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:10:18,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 11:10:18,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:18,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:18,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:10:18,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:19,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:19,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:20,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:20,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:20,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:20,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:20,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:10:21,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:21,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:22,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:22,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:22,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:22,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 11:10:22,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 11:10:22,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:23,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:10:23,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:23,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:23,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:23,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:10:23,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:24,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:10:24,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:24,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:24,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:24,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 11:10:24,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:24,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:10:25,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:25,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:25,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:25,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:25,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:26,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:26,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:26,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:10:26,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:27,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 11:10:27,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:28,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 11:10:28,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:10:29,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:29,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:29,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 11:10:29,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 11:10:29,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:30,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 11:10:30,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:30,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:10:30,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 11:10:30,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:30,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:30,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 11:10:30,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:30,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:30,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 11:10:30,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 11:10:31,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:31,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 11:10:31,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:31,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:31,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 11:10:31,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 11:10:31,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 11:10:31,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:31,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:31,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 11:10:31,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:32,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:32,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:32,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:10:32,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 11:10:33,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:33,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:34,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 11:10:34,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:34,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:34,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:35,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 11:10:35,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:35,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:35,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:35,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:10:35,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:35,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:36,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:36,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 11:10:37,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:38,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:38,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:38,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:38,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:38,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:10:38,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:38,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:39,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:39,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 11:10:39,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:40,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:40,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:40,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 11:10:41,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:41,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:41,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:41,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:10:41,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:42,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 11:10:42,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:10:43,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:43,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 11:10:43,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:43,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:43,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:43,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:43,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 11:10:43,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 11:10:43,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:44,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:44,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:44,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 11:10:44,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:45,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 11:10:45,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:45,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:45,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:45,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:46,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:46,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:46,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:46,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 11:10:46,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:46,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:47,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:47,903 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 11:10:47,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:48,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:48,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:48,240 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 11:10:48,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:48,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 11:10:48,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:49,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:49,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:49,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 11:10:49,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 11:10:49,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:49,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:49,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:50,178 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 11:10:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:50,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 11:10:50,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 11:10:50,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:51,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:51,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:51,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 11:10:51,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 11:10:52,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:52,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:52,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:53,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 11:10:53,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:53,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:53,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:10:53,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:54,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:54,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 11:10:54,962 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 11:10:55,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:55,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 11:10:55,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 11:10:55,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:56,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:56,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 11:10:56,978 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 11:10:57,001 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 11:10:57,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 11:10:57,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:57,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 11:10:58,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 11:10:58,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:10:58,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:58,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 11:10:59,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:10:59,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 11:10:59,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:59,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:59,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:59,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:59,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:11:00,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:00,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 11:11:00,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 11:11:00,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 11:11:00,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:11:00,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 11:11:00,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:00,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:11:01,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:11:01,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:11:02,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:11:02,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 11:11:02,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:02,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:11:02,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:11:02,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:02,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:11:02,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:11:02,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:11:02,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 11:11:03,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:11:03,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:11:03,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:11:03,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 11:11:03,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:11:04,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:11:04,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 11:11:05,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:06,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:11:06,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:06,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:06,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:07,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 11:11:07,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:11:08,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:11:08,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:11:08,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:08,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:09,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 11:11:09,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:09,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:11:10,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:11:10,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:11:10,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:11:10,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:11:10,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:11:10,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:10,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:11:11,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:11:11,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:11,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 11:11:12,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:11:12,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:11:12,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:12,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:11:12,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:11:12,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 11:11:13,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:11:13,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:13,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 11:11:13,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:11:13,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:11:14,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 11:11:14,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 11:11:14,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 11:11:15,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:15,134 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 11:11:15,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:11:15,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:15,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:11:15,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 11:11:15,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:15,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:16,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 11:11:16,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 11:11:16,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 11:11:17,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 11:11:17,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:11:17,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:11:17,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:17,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 11:11:18,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:18,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:11:18,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:18,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:11:18,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:11:18,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:11:19,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:19,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:19,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:19,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:20,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 11:11:20,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:11:20,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:11:20,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:20,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:11:20,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:11:20,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:22,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:22,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:11:22,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:11:22,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:23,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:11:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:11:23,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:11:23,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 11:11:23,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:24,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:27,545 INFO [scaling.py:1022] (1/4) Whitening: name=None, num_groups=1, num_channels=256, metric=76.53 vs. limit=7.5 2023-09-28 11:11:27,852 INFO [scaling.py:1022] (1/4) Whitening: name=None, num_groups=1, num_channels=512, metric=159.23 vs. limit=7.5 2023-09-28 11:11:28,534 INFO [train.py:1379] (1/4) Maximum memory allocated so far is 19445MB 2023-09-28 11:11:31,714 INFO [train.py:1379] (1/4) Maximum memory allocated so far is 19594MB 2023-09-28 11:11:36,034 INFO [train.py:1379] (1/4) Maximum memory allocated so far is 19594MB 2023-09-28 11:11:39,575 INFO [train.py:1379] (1/4) Maximum memory allocated so far is 19594MB 2023-09-28 11:11:51,708 INFO [train.py:1379] (1/4) Maximum memory allocated so far is 19594MB 2023-09-28 11:11:58,976 INFO [train.py:1379] (1/4) Maximum memory allocated so far is 19594MB 2023-09-28 11:12:16,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:12:16,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 11:12:16,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 11:12:16,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:17,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:12:17,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:17,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:12:18,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:12:18,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 11:12:18,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 11:12:18,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 11:12:18,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:12:18,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 11:12:18,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 11:12:18,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:19,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:19,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:20,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:12:20,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:20,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:20,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:20,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:20,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:12:20,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:20,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:12:21,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 11:12:22,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:22,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:22,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 11:12:22,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 11:12:22,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:12:22,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:22,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 11:12:23,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 11:12:23,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:24,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:12:24,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:24,683 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 11:12:24,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 11:12:24,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:12:24,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:25,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 11:12:25,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 11:12:25,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 11:12:25,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:30,490 INFO [train.py:1039] (1/4) Epoch 1, batch 0, loss[loss=9.369, simple_loss=8.513, pruned_loss=8.539, over 23291.00 frames. ], tot_loss[loss=9.369, simple_loss=8.513, pruned_loss=8.539, over 23291.00 frames. ], batch size: 106, lr: 2.25e-02, grad_scale: 1.0 2023-09-28 11:12:30,490 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 11:12:44,950 INFO [train.py:1071] (1/4) Epoch 1, validation: loss=9.318, simple_loss=8.466, pruned_loss=8.496, over 1125622.00 frames. 2023-09-28 11:12:44,951 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 19594MB 2023-09-28 11:12:46,454 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=8.95 vs. limit=7.5 2023-09-28 11:12:47,708 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.82 vs. limit=7.5 2023-09-28 11:12:50,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 11:12:50,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:12:52,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:12:54,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.47 vs. limit=5.0 2023-09-28 11:12:57,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=0.0, ans=0.3 2023-09-28 11:12:58,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:58,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:13:01,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:01,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 11:13:02,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 11:13:06,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:06,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:07,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=66.66666666666667, ans=0.8976666666666667 2023-09-28 11:13:10,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:11,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:11,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:13:11,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:13,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 11:13:17,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:17,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=510.83 vs. limit=7.55 2023-09-28 11:13:21,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=21.03 vs. limit=7.55 2023-09-28 11:13:26,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:13:26,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:27,970 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=83.19 vs. limit=5.066666666666666 2023-09-28 11:13:28,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 11:13:33,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:13:33,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:13:34,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=511.11 vs. limit=7.6 2023-09-28 11:13:36,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=245.54 vs. limit=7.55 2023-09-28 11:13:36,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:40,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=245.20 vs. limit=7.575 2023-09-28 11:13:42,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=482.64 vs. limit=7.575 2023-09-28 11:13:43,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:13:47,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:54,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 11:13:57,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 11:13:57,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:13:57,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:13:59,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:13:59,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:02,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 11:14:04,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:07,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:14:11,234 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 11:14:14,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:14:15,858 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=19.62 vs. limit=5.166666666666667 2023-09-28 11:14:16,927 INFO [train.py:1039] (1/4) Epoch 1, batch 50, loss[loss=1.234, simple_loss=1.099, pruned_loss=1.214, over 23392.00 frames. ], tot_loss[loss=3.831, simple_loss=3.525, pruned_loss=3.003, over 1057502.93 frames. ], batch size: 119, lr: 2.48e-02, grad_scale: 0.25 2023-09-28 11:14:19,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:19,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=333.3333333333333, ans=0.0925 2023-09-28 11:14:21,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:14:22,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 11:14:22,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:14:22,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:14:26,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:26,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:31,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:35,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 11:14:35,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:40,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=400.0, ans=0.246 2023-09-28 11:14:44,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:14:44,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 11:14:46,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 11:14:49,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:14:51,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:14:51,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:51,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:53,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:14:53,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:14:53,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:55,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=466.6666666666667, ans=0.0895 2023-09-28 11:14:59,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=466.6666666666667, ans=0.8836666666666667 2023-09-28 11:15:00,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=53.55 vs. limit=7.85 2023-09-28 11:15:02,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:04,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:04,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:15:04,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 11:15:06,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:15:08,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:15:08,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 11:15:09,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:10,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 11:15:14,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=267.63 vs. limit=7.7 2023-09-28 11:15:19,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:15:19,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:21,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:23,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:15:23,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:25,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 11:15:25,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 11:15:25,998 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=460.02 vs. limit=7.7 2023-09-28 11:15:27,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:27,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:32,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:15:32,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:32,903 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=10.59 vs. limit=4.24 2023-09-28 11:15:34,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 11:15:34,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 11:15:36,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:15:36,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:37,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=600.0, ans=0.471875 2023-09-28 11:15:38,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:15:39,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=600.0, ans=0.879 2023-09-28 11:15:40,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 11:15:40,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 11:15:40,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:42,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:42,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:15:43,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:15:47,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:15:49,917 INFO [train.py:1039] (1/4) Epoch 1, batch 100, loss[loss=1.209, simple_loss=1.04, pruned_loss=1.342, over 23178.00 frames. ], tot_loss[loss=2.398, simple_loss=2.177, pruned_loss=2.05, over 1882824.31 frames. ], batch size: 93, lr: 2.70e-02, grad_scale: 0.5 2023-09-28 11:15:50,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=33.95 vs. limit=5.333333333333333 2023-09-28 11:15:51,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:15:55,762 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 2.173e+02 3.855e+02 5.319e+03 2.503e+05, threshold=7.710e+02, percent-clipped=0.0 2023-09-28 11:15:55,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:15:56,935 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=240.95 vs. limit=5.333333333333333 2023-09-28 11:15:57,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 11:15:58,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=214.24 vs. limit=7.75 2023-09-28 11:15:59,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:16:02,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:16:04,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:04,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:16:04,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:16:04,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:06,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 11:16:08,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:16:08,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:08,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:08,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:16:09,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=64.81 vs. limit=5.366666666666667 2023-09-28 11:16:10,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=733.3333333333334, ans=0.17250000000000001 2023-09-28 11:16:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 11:16:14,135 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=52.49 vs. limit=7.775 2023-09-28 11:16:16,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:18,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:18,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:16:20,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:16:24,646 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.62 vs. limit=7.775 2023-09-28 11:16:25,368 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 11:16:25,404 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 11:16:27,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:16:27,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:16:31,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:16:35,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:36,769 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=243.92 vs. limit=7.8 2023-09-28 11:16:39,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,247 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 11:16:48,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=6.76 vs. limit=4.346666666666667 2023-09-28 11:16:49,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:16:54,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:16:54,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:16:56,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:59,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.98 vs. limit=4.346666666666667 2023-09-28 11:17:00,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=42.84 vs. limit=8.15 2023-09-28 11:17:01,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:02,596 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=78.63 vs. limit=7.825 2023-09-28 11:17:05,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:05,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=933.3333333333334, ans=0.29066666666666663 2023-09-28 11:17:09,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:17:11,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:11,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:12,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=47.40 vs. limit=7.85 2023-09-28 11:17:13,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:13,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:17:13,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:14,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 11:17:14,868 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 11:17:15,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=933.3333333333334, ans=0.09416666666666668 2023-09-28 11:17:16,481 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.76 vs. limit=7.85 2023-09-28 11:17:17,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:17,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:17:18,145 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=27.97 vs. limit=8.2 2023-09-28 11:17:19,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:19,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:19,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:17:19,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:17:20,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:17:20,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:20,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:22,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:22,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:17:24,256 INFO [train.py:1039] (1/4) Epoch 1, batch 150, loss[loss=1.047, simple_loss=0.8872, pruned_loss=1.149, over 24305.00 frames. ], tot_loss[loss=1.84, simple_loss=1.646, pruned_loss=1.683, over 2511246.19 frames. ], batch size: 74, lr: 2.93e-02, grad_scale: 0.5 2023-09-28 11:17:24,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:17:27,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:28,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1000.0, ans=0.046875 2023-09-28 11:17:29,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:29,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:17:29,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:32,077 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=24.11 vs. limit=7.875 2023-09-28 11:17:36,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:36,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:38,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1000.0, ans=0.865 2023-09-28 11:17:41,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:17:42,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.79 vs. limit=5.266666666666667 2023-09-28 11:17:42,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:43,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1066.6666666666667, ans=0.04666666666666667 2023-09-28 11:17:47,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 11:17:47,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 11:17:47,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 11:17:50,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:17:50,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:17:54,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:17:54,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:55,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=81.40 vs. limit=7.9 2023-09-28 11:17:55,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:55,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:57,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:58,502 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 11:18:00,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=79.72 vs. limit=7.925 2023-09-28 11:18:01,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:07,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:10,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:18:10,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 11:18:11,341 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=71.46 vs. limit=7.925 2023-09-28 11:18:14,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=236.70 vs. limit=7.925 2023-09-28 11:18:15,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:18:15,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:15,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:16,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=35.28 vs. limit=7.925 2023-09-28 11:18:17,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:18:18,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.31 vs. limit=5.283333333333333 2023-09-28 11:18:18,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:18:22,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:18:22,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:22,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 11:18:24,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1200.0, ans=0.44375 2023-09-28 11:18:31,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:31,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:18:33,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:18:33,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:18:37,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:38,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:18:40,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:18:43,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=178.22 vs. limit=7.975 2023-09-28 11:18:44,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:18:46,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:18:47,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:18:48,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 11:18:50,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:50,171 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 11:18:54,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:58,370 INFO [train.py:1039] (1/4) Epoch 1, batch 200, loss[loss=0.826, simple_loss=0.6918, pruned_loss=0.8875, over 17177.00 frames. ], tot_loss[loss=1.528, simple_loss=1.351, pruned_loss=1.446, over 3002250.59 frames. ], batch size: 37, lr: 3.15e-02, grad_scale: 1.0 2023-09-28 11:19:00,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:19:01,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:19:02,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=17.43 vs. limit=8.5 2023-09-28 11:19:03,418 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 9.506e+01 1.160e+02 1.347e+02 1.565e+02 3.276e+02, threshold=2.693e+02, percent-clipped=0.0 2023-09-28 11:19:04,465 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=12.05 vs. limit=4.533333333333333 2023-09-28 11:19:05,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 11:19:05,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:05,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:09,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 11:19:11,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:19:12,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:13,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:15,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1400.0, ans=0.434375 2023-09-28 11:19:18,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:19:18,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:19:18,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:34,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=14.44 vs. limit=4.586666666666667 2023-09-28 11:19:36,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.84 vs. limit=8.6 2023-09-28 11:19:42,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=108.90 vs. limit=8.05 2023-09-28 11:19:43,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=17.07 vs. limit=5.733333333333333 2023-09-28 11:19:44,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1466.6666666666667, ans=0.145 2023-09-28 11:19:46,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:19:46,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:19:46,917 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=18.17 vs. limit=8.05 2023-09-28 11:19:48,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:19:48,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:19:49,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=18.58 vs. limit=8.05 2023-09-28 11:19:50,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:19:50,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:19:52,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:52,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:19:52,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:52,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:19:54,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 11:19:55,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:19:55,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:00,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:20:09,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:20:09,984 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=8.075 2023-09-28 11:20:17,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=40.56 vs. limit=8.1 2023-09-28 11:20:18,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:18,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:20:26,578 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.15 vs. limit=5.4 2023-09-28 11:20:27,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:29,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 11:20:29,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:29,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:20:29,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:20:29,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:20:29,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.52 vs. limit=8.75 2023-09-28 11:20:30,796 INFO [train.py:1039] (1/4) Epoch 1, batch 250, loss[loss=0.78, simple_loss=0.6623, pruned_loss=0.7479, over 22736.00 frames. ], tot_loss[loss=1.332, simple_loss=1.167, pruned_loss=1.278, over 3376471.61 frames. ], batch size: 322, lr: 3.38e-02, grad_scale: 1.0 2023-09-28 11:20:31,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 11:20:31,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=16.98 vs. limit=5.833333333333333 2023-09-28 11:20:32,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:20:32,816 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 11:20:33,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:33,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=14.04 vs. limit=5.833333333333333 2023-09-28 11:20:37,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:20:38,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:40,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:40,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1666.6666666666667, ans=0.29166666666666663 2023-09-28 11:20:42,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:20:43,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:45,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:20:51,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:20:56,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=10.60 vs. limit=5.433333333333334 2023-09-28 11:21:02,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:02,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1733.3333333333333, ans=0.135 2023-09-28 11:21:06,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:21:06,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1800.0, ans=0.0595 2023-09-28 11:21:07,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:21:12,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.26 vs. limit=5.9 2023-09-28 11:21:15,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:21:15,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:21:17,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:21:17,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:19,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:21:19,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:21:19,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:23,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:21:23,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.11 vs. limit=5.9 2023-09-28 11:21:26,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 11:21:26,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:26,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1866.6666666666667, ans=0.4125 2023-09-28 11:21:27,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=14.96 vs. limit=8.2 2023-09-28 11:21:29,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:21:29,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:21:29,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:21:29,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:21:32,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:21:32,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:21:32,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1866.6666666666667, ans=0.2813333333333333 2023-09-28 11:21:34,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:34,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1866.6666666666667, ans=0.4125 2023-09-28 11:21:35,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:21:35,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:38,704 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=17.52 vs. limit=8.2 2023-09-28 11:21:42,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=97.08 vs. limit=8.2 2023-09-28 11:21:43,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:21:45,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.12 vs. limit=8.95 2023-09-28 11:21:46,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=57.36 vs. limit=5.0 2023-09-28 11:21:46,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:49,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=15.51 vs. limit=5.966666666666667 2023-09-28 11:21:50,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:21:56,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:57,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:22:03,741 INFO [train.py:1039] (1/4) Epoch 1, batch 300, loss[loss=0.7556, simple_loss=0.637, pruned_loss=0.7071, over 23404.00 frames. ], tot_loss[loss=1.197, simple_loss=1.04, pruned_loss=1.154, over 3668005.06 frames. ], batch size: 285, lr: 3.60e-02, grad_scale: 2.0 2023-09-28 11:22:03,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 11:22:03,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:05,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:22:07,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 11:22:07,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:22:08,762 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.09 vs. limit=9.0 2023-09-28 11:22:09,476 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 8.573e+01 1.074e+02 1.349e+02 1.820e+02 4.135e+02, threshold=2.699e+02, percent-clipped=10.0 2023-09-28 11:22:09,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:22:09,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 11:22:10,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=2000.0, ans=0.5 2023-09-28 11:22:12,381 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=1.83 vs. limit=3.3 2023-09-28 11:22:13,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:22:13,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:22:16,061 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.15 vs. limit=9.0 2023-09-28 11:22:17,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:22:17,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 11:22:19,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:22:19,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=2000.0, ans=0.055 2023-09-28 11:22:20,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:22:20,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 11:22:20,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:21,579 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.58 vs. limit=6.033333333333333 2023-09-28 11:22:22,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=2066.6666666666665, ans=0.403125 2023-09-28 11:22:25,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=122.61 vs. limit=8.275 2023-09-28 11:22:26,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:22:32,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:22:32,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 11:22:36,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 11:22:37,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:39,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=50.87 vs. limit=6.033333333333333 2023-09-28 11:22:39,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:42,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:42,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 11:22:42,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:22:43,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:22:46,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.56 vs. limit=6.066666666666666 2023-09-28 11:22:47,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:22:48,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:53,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:22:53,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 11:22:54,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=2133.3333333333335, ans=0.22866666666666666 2023-09-28 11:22:56,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:22:57,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=20.35 vs. limit=8.3 2023-09-28 11:22:57,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:59,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 11:23:01,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:08,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:23:11,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:23:11,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 11:23:12,238 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=29.01 vs. limit=8.325 2023-09-28 11:23:16,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:16,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:23:17,115 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=40.02 vs. limit=8.35 2023-09-28 11:23:19,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:20,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:23:20,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.34 vs. limit=5.566666666666666 2023-09-28 11:23:21,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 11:23:21,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:23:21,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:23:25,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 11:23:27,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:27,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:28,660 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=32.47 vs. limit=8.35 2023-09-28 11:23:30,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:30,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:31,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:36,549 INFO [train.py:1039] (1/4) Epoch 1, batch 350, loss[loss=0.7937, simple_loss=0.6501, pruned_loss=0.778, over 21880.00 frames. ], tot_loss[loss=1.101, simple_loss=0.9478, pruned_loss=1.059, over 3895533.91 frames. ], batch size: 48, lr: 3.83e-02, grad_scale: 2.0 2023-09-28 11:23:38,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:38,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:23:41,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:46,788 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=38.24 vs. limit=8.375 2023-09-28 11:23:48,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:50,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=2333.3333333333335, ans=0.1125 2023-09-28 11:23:52,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:52,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:56,148 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.70 vs. limit=4.96 2023-09-28 11:23:57,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 11:23:57,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:57,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 11:23:59,796 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=36.24 vs. limit=8.4 2023-09-28 11:24:00,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:00,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 11:24:03,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:04,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 11:24:08,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:24:08,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=2400.0, ans=0.3875 2023-09-28 11:24:09,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.66 vs. limit=9.3 2023-09-28 11:24:10,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:10,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:24:12,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:12,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:14,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:24:14,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:24:14,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:15,018 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.38 vs. limit=5.616666666666666 2023-09-28 11:24:24,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:24:24,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:24:26,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:24:26,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:31,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=2533.3333333333335, ans=0.04949747468305833 2023-09-28 11:24:32,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 11:24:32,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:40,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:40,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:24:42,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:24:43,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 11:24:46,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:46,414 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 11:24:48,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 11:24:48,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:48,892 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.14 vs. limit=8.45 2023-09-28 11:24:53,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:53,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 11:24:53,859 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.89 vs. limit=8.475 2023-09-28 11:24:54,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:57,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:24:59,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:59,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=2600.0, ans=0.809 2023-09-28 11:24:59,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=2600.0, ans=0.378125 2023-09-28 11:25:00,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:01,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:03,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:08,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:25:10,563 INFO [train.py:1039] (1/4) Epoch 1, batch 400, loss[loss=0.8313, simple_loss=0.6929, pruned_loss=0.7414, over 23600.00 frames. ], tot_loss[loss=1.043, simple_loss=0.8897, pruned_loss=0.9979, over 4075176.43 frames. ], batch size: 256, lr: 4.05e-02, grad_scale: 4.0 2023-09-28 11:25:10,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:25:12,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 11:25:12,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:12,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:12,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=2666.6666666666665, ans=0.04 2023-09-28 11:25:14,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:25:14,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:14,961 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=14.41 vs. limit=8.5 2023-09-28 11:25:15,792 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 9.874e+01 1.367e+02 1.651e+02 2.389e+02 7.473e+02, threshold=3.302e+02, percent-clipped=14.0 2023-09-28 11:25:17,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:19,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:22,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 11:25:23,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 11:25:23,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:25,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 11:25:25,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:29,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:25:29,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:30,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 11:25:31,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:25:31,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:31,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:33,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:35,733 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 11:25:37,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 11:25:41,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=16.72 vs. limit=9.55 2023-09-28 11:25:42,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:42,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=2733.3333333333335, ans=0.371875 2023-09-28 11:25:44,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:45,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 11:25:46,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 11:25:49,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:25:51,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:25:56,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=2800.0, ans=0.0825 2023-09-28 11:25:57,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 11:26:02,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:26:04,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 11:26:08,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:26:09,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:26:09,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 11:26:15,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=14.68 vs. limit=8.575 2023-09-28 11:26:15,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:26:17,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:26:19,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:26:19,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=2866.6666666666665, ans=0.365625 2023-09-28 11:26:19,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=2866.6666666666665, ans=0.243 2023-09-28 11:26:19,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=2866.6666666666665, ans=6.791666666666666 2023-09-28 11:26:22,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:22,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 11:26:24,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:26:27,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 11:26:29,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:26:29,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:26:33,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 11:26:35,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:26:37,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:26:37,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:26:40,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 11:26:40,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:26:40,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:26:42,310 INFO [train.py:1039] (1/4) Epoch 1, batch 450, loss[loss=0.8461, simple_loss=0.6854, pruned_loss=0.7854, over 24582.00 frames. ], tot_loss[loss=1.003, simple_loss=0.8483, pruned_loss=0.9517, over 4212771.63 frames. ], batch size: 60, lr: 4.28e-02, grad_scale: 4.0 2023-09-28 11:26:42,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:26:42,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 11:26:42,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:26:44,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:26:48,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:26:52,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=3000.0, ans=0.0875 2023-09-28 11:26:57,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:59,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:01,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 11:27:03,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 11:27:04,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=3066.6666666666665, ans=0.35625 2023-09-28 11:27:08,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:27:11,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:13,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:19,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:19,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:21,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 11:27:22,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 11:27:25,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 11:27:26,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:27:28,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:28,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:27:31,094 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 11:27:31,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=3133.3333333333335, ans=0.353125 2023-09-28 11:27:32,584 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 11:27:32,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:34,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:27:36,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:27:38,699 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=32.59 vs. limit=8.7 2023-09-28 11:27:39,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:27:39,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:27:41,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:27:41,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 11:27:44,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:45,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=3200.0, ans=7.0 2023-09-28 11:27:46,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:27:46,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=3200.0, ans=0.35 2023-09-28 11:27:48,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:27:49,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 11:27:52,774 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.78 vs. limit=8.7 2023-09-28 11:27:53,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:27:56,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 11:27:57,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 11:27:59,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:28:04,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:28:05,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:05,699 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=21.61 vs. limit=8.725 2023-09-28 11:28:09,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:28:09,110 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 11:28:12,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:13,992 INFO [train.py:1039] (1/4) Epoch 1, batch 500, loss[loss=0.8601, simple_loss=0.6986, pruned_loss=0.7648, over 23428.00 frames. ], tot_loss[loss=0.9696, simple_loss=0.8132, pruned_loss=0.9097, over 4325538.35 frames. ], batch size: 93, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:28:14,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:28:14,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:15,788 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 11:28:15,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 11:28:15,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:19,328 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 9.903e+01 1.529e+02 1.913e+02 2.430e+02 4.167e+02, threshold=3.825e+02, percent-clipped=6.0 2023-09-28 11:28:19,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:28:26,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:28:27,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:28:31,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:31,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:33,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:28:35,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=3400.0, ans=0.340625 2023-09-28 11:28:42,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=3400.0, ans=7.125 2023-09-28 11:28:43,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=3400.0, ans=0.340625 2023-09-28 11:28:47,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:47,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:28:47,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:28:47,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:48,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 11:28:48,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:28:52,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:28:53,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:28:53,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:28:53,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:53,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 11:28:55,852 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 11:28:59,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:28:59,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:01,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:29:03,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.56 vs. limit=10.1 2023-09-28 11:29:04,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 11:29:06,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=3533.3333333333335, ans=0.7763333333333333 2023-09-28 11:29:08,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:29:10,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:14,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:16,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=3533.3333333333335, ans=0.26466666666666666 2023-09-28 11:29:17,095 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=14.96 vs. limit=8.825 2023-09-28 11:29:19,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:26,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:30,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 11:29:30,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:30,782 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=16.10 vs. limit=8.85 2023-09-28 11:29:31,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:35,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 11:29:35,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:29:36,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:37,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=15.04 vs. limit=8.85 2023-09-28 11:29:43,591 INFO [train.py:1039] (1/4) Epoch 1, batch 550, loss[loss=0.9492, simple_loss=0.768, pruned_loss=0.8247, over 24441.00 frames. ], tot_loss[loss=0.9444, simple_loss=0.7866, pruned_loss=0.8731, over 4413492.06 frames. ], batch size: 69, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:29:43,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 11:29:45,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 11:29:45,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:45,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 11:29:47,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:29:47,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:29:51,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:29:52,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.32 vs. limit=5.916666666666667 2023-09-28 11:29:52,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=2.61 vs. limit=5.466666666666667 2023-09-28 11:29:53,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:56,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 11:29:56,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:30:00,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:00,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:03,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=57.68 vs. limit=8.9 2023-09-28 11:30:04,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:05,070 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=3.56 2023-09-28 11:30:05,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:11,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.49 vs. limit=5.493333333333333 2023-09-28 11:30:12,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 11:30:12,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 11:30:14,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:30:14,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=3733.3333333333335, ans=0.07666666666666667 2023-09-28 11:30:19,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:30:20,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.34 vs. limit=10.35 2023-09-28 11:30:21,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:22,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:30:27,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:27,097 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 11:30:27,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:28,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:30:32,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:35,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:30:35,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:30:37,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:38,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 11:30:40,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 11:30:40,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:40,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:42,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:30:42,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:30:45,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:30:45,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:30:48,011 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=14.15 vs. limit=8.95 2023-09-28 11:30:48,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:30:49,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.26 vs. limit=10.4 2023-09-28 11:30:50,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:51,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:30:51,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:30:53,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:55,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:30:55,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:57,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:30:58,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:31:03,322 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=30.75 vs. limit=10.45 2023-09-28 11:31:05,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.52 vs. limit=5.573333333333333 2023-09-28 11:31:06,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=3933.3333333333335, ans=0.315625 2023-09-28 11:31:08,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 11:31:12,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 11:31:14,892 INFO [train.py:1039] (1/4) Epoch 1, batch 600, loss[loss=0.8965, simple_loss=0.726, pruned_loss=0.7547, over 24655.00 frames. ], tot_loss[loss=0.9266, simple_loss=0.7669, pruned_loss=0.8425, over 4489050.71 frames. ], batch size: 68, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:31:14,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:31:15,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:31:15,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:31:15,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.23 vs. limit=9.0 2023-09-28 11:31:17,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=4000.0, ans=0.04999999999999999 2023-09-28 11:31:21,135 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.60 vs. limit=9.0 2023-09-28 11:31:21,656 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.125e+02 1.678e+02 2.306e+02 3.262e+02 8.742e+02, threshold=4.612e+02, percent-clipped=14.0 2023-09-28 11:31:23,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:31:24,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=9.0 2023-09-28 11:31:25,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:31:26,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 11:31:28,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:31:30,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:31:32,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:36,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 11:31:37,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:31:44,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 11:31:49,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:31:49,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:49,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:31:55,609 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.70 vs. limit=9.05 2023-09-28 11:31:56,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:31:56,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:31:57,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:06,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:32:08,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=25.37 vs. limit=9.075 2023-09-28 11:32:08,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.53 vs. limit=9.075 2023-09-28 11:32:10,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=4200.0, ans=0.303125 2023-09-28 11:32:11,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:11,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:32:11,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:32:12,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=4200.0, ans=0.0 2023-09-28 11:32:14,195 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=29.07 vs. limit=9.075 2023-09-28 11:32:17,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 11:32:24,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:32:24,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:32:28,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=4266.666666666667, ans=0.03666666666666667 2023-09-28 11:32:29,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 11:32:29,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:32:33,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 11:32:33,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:32:33,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:32:37,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.21 vs. limit=9.1 2023-09-28 11:32:42,886 INFO [train.py:1039] (1/4) Epoch 1, batch 650, loss[loss=0.9004, simple_loss=0.7391, pruned_loss=0.7166, over 23963.00 frames. ], tot_loss[loss=0.9028, simple_loss=0.7449, pruned_loss=0.8034, over 4536717.56 frames. ], batch size: 80, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:32:42,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:32:45,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:32:48,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:32:48,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:32:52,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:32:55,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 11:32:56,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:33:02,173 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=24.01 vs. limit=9.15 2023-09-28 11:33:03,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:33:03,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:05,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:06,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=10.8 2023-09-28 11:33:09,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 11:33:10,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:10,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:15,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:15,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:33:19,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:19,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:19,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:33:21,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:23,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:33:26,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:33:26,461 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 11:33:26,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:26,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:31,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:33,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:33,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:34,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:33:34,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=4466.666666666667, ans=0.290625 2023-09-28 11:33:35,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 11:33:37,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:33:37,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:33:37,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=4533.333333333333, ans=0.04777777777777778 2023-09-28 11:33:39,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:33:39,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:40,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:33:42,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 11:33:42,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 11:33:42,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:42,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:42,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:33:43,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=4533.333333333333, ans=0.25466666666666665 2023-09-28 11:33:44,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:46,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:48,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=4533.333333333333, ans=0.09899494936611666 2023-09-28 11:33:53,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:53,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:54,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:58,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:59,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:33:59,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:34:00,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=4600.0, ans=0.739 2023-09-28 11:34:05,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:34:05,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:07,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:07,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:13,637 INFO [train.py:1039] (1/4) Epoch 1, batch 700, loss[loss=0.8183, simple_loss=0.6818, pruned_loss=0.6163, over 24581.00 frames. ], tot_loss[loss=0.8753, simple_loss=0.7219, pruned_loss=0.7593, over 4587972.77 frames. ], batch size: 71, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:34:15,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 11:34:16,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 11:34:19,534 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.63 vs. limit=7.333333333333334 2023-09-28 11:34:20,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 11:34:20,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:21,819 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.160e+02 1.725e+02 2.743e+02 3.715e+02 1.987e+03, threshold=5.486e+02, percent-clipped=15.0 2023-09-28 11:34:22,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:34:24,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.71 vs. limit=11.0 2023-09-28 11:34:25,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 11:34:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:33,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:34:34,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=4733.333333333333, ans=0.0 2023-09-28 11:34:35,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:35,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:34:37,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:34:37,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=6.31 vs. limit=5.8933333333333335 2023-09-28 11:34:40,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:43,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=23.87 vs. limit=9.275 2023-09-28 11:34:44,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:34:44,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:34:46,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 11:34:51,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 11:34:54,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=4800.0, ans=0.272 2023-09-28 11:34:55,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:34:57,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:34:58,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:35:02,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:35:04,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 11:35:08,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.13 vs. limit=9.325 2023-09-28 11:35:09,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:11,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:35:11,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 11:35:14,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:35:16,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:20,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:35:27,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:35:28,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 11:35:31,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 11:35:31,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 11:35:33,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:34,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:36,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:35:37,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.36 vs. limit=11.2 2023-09-28 11:35:39,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:39,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 11:35:41,515 INFO [train.py:1039] (1/4) Epoch 1, batch 750, loss[loss=0.6821, simple_loss=0.5654, pruned_loss=0.5088, over 23463.00 frames. ], tot_loss[loss=0.8444, simple_loss=0.6975, pruned_loss=0.7124, over 4624199.69 frames. ], batch size: 134, lr: 4.49e-02, grad_scale: 4.0 2023-09-28 11:35:44,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 11:35:44,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 11:35:44,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 11:35:45,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=11.25 2023-09-28 11:35:46,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 11:35:46,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 11:35:46,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:35:48,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 11:35:49,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:49,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:35:51,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:35:53,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:53,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:35:54,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:56,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:35:57,405 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.91 vs. limit=9.4 2023-09-28 11:35:58,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:36:04,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:36:06,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:36:07,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:07,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 11:36:09,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:36:11,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:12,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:14,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:36:16,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 11:36:16,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:36:19,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 11:36:19,676 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 11:36:19,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 11:36:19,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:36:19,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:36:22,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:36:29,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=5133.333333333333, ans=0.259375 2023-09-28 11:36:31,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:36:31,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:31,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:36:32,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:36:35,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:36:36,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 11:36:38,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:36:38,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:36:40,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:36:42,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:36:42,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 11:36:44,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:46,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=5200.0, ans=0.03375 2023-09-28 11:36:50,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:36:52,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:36:52,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:55,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:37:00,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 11:37:00,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:02,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:06,511 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.04 vs. limit=9.5 2023-09-28 11:37:07,756 INFO [train.py:1039] (1/4) Epoch 1, batch 800, loss[loss=0.6899, simple_loss=0.5803, pruned_loss=0.4906, over 23226.00 frames. ], tot_loss[loss=0.8148, simple_loss=0.6755, pruned_loss=0.667, over 4643704.23 frames. ], batch size: 93, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:37:07,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:10,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:37:16,636 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 4.125e+02 6.476e+02 9.801e+02 2.445e+03, threshold=1.295e+03, percent-clipped=55.0 2023-09-28 11:37:19,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:19,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:21,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:21,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:23,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:23,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:26,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:28,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=5400.0, ans=0.246875 2023-09-28 11:37:28,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=5400.0, ans=0.246 2023-09-28 11:37:30,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:30,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:37:31,281 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.69 vs. limit=11.55 2023-09-28 11:37:33,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 11:37:35,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:35,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:35,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:37:36,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:36,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 11:37:36,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:38,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 11:37:40,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:44,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:44,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=5466.666666666667, ans=9.55 2023-09-28 11:37:47,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:47,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:50,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:50,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=5466.666666666667, ans=0.24533333333333332 2023-09-28 11:37:52,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:56,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:37:57,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:37:57,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:37:58,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=5466.666666666667, ans=0.04949747468305833 2023-09-28 11:38:00,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.51 vs. limit=9.575 2023-09-28 11:38:01,300 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 11:38:01,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 11:38:01,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:38:01,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:02,011 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.81 vs. limit=9.575 2023-09-28 11:38:03,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:03,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:08,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=5533.333333333333, ans=0.24062499999999998 2023-09-28 11:38:09,856 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 11:38:09,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 11:38:11,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:38:13,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:38:17,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.55 vs. limit=9.6 2023-09-28 11:38:18,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:38:21,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:38:22,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.43 vs. limit=9.6 2023-09-28 11:38:23,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 11:38:23,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:38:27,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 11:38:34,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:36,584 INFO [train.py:1039] (1/4) Epoch 1, batch 850, loss[loss=0.6388, simple_loss=0.5453, pruned_loss=0.4339, over 23146.00 frames. ], tot_loss[loss=0.7814, simple_loss=0.6514, pruned_loss=0.6199, over 4655737.43 frames. ], batch size: 105, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:38:38,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:38:40,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 11:38:40,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:38:40,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:40,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 11:38:40,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:43,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:38:44,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=5666.666666666667, ans=0.24333333333333332 2023-09-28 11:38:45,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:45,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:38:46,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:48,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 11:38:50,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 11:38:50,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 11:38:51,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:51,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:38:53,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:53,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:54,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:38:57,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=5733.333333333333, ans=0.07 2023-09-28 11:39:01,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:01,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:02,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.36 vs. limit=9.65 2023-09-28 11:39:03,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 11:39:06,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 11:39:09,927 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.35 vs. limit=11.85 2023-09-28 11:39:10,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:10,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 11:39:15,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.45 vs. limit=9.675 2023-09-28 11:39:16,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 11:39:16,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 11:39:19,981 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 11:39:19,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:20,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:39:20,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:39:23,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:24,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:25,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 11:39:26,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:28,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:29,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.77 vs. limit=7.933333333333334 2023-09-28 11:39:29,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:39:31,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:39:33,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:39:34,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=5866.666666666667, ans=6.466666666666667 2023-09-28 11:39:35,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:39:35,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 11:39:42,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:39:42,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:39:42,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:39:42,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:39:44,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:44,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=5933.333333333333, ans=0.009579710144927537 2023-09-28 11:39:45,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:46,625 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.24 vs. limit=11.95 2023-09-28 11:39:47,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:39:49,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:39:49,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:39:51,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:39:55,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=5933.333333333333, ans=0.221875 2023-09-28 11:39:57,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.20 vs. limit=6.483333333333333 2023-09-28 11:40:00,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:40:01,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:40:03,136 INFO [train.py:1039] (1/4) Epoch 1, batch 900, loss[loss=0.5674, simple_loss=0.4973, pruned_loss=0.3602, over 24453.00 frames. ], tot_loss[loss=0.7519, simple_loss=0.6301, pruned_loss=0.5787, over 4660129.23 frames. ], batch size: 58, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:40:03,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 11:40:03,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:03,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:40:03,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=6000.0, ans=0.21875 2023-09-28 11:40:06,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 11:40:10,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:40:12,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:13,969 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 3.628e+02 6.882e+02 1.109e+03 2.718e+03, threshold=1.376e+03, percent-clipped=19.0 2023-09-28 11:40:14,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 11:40:17,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.46 vs. limit=9.75 2023-09-28 11:40:17,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:40:18,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 11:40:18,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:40:19,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:19,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:21,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:40:21,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:40:30,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.67 vs. limit=12.05 2023-09-28 11:40:35,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=6066.666666666667, ans=0.215625 2023-09-28 11:40:36,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:40:36,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:36,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:40:38,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:41,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=6133.333333333333, ans=0.21250000000000002 2023-09-28 11:40:43,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 11:40:46,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:40:52,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:40:54,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:40:54,171 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 11:40:55,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 11:41:01,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:41:02,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:41:02,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:41:09,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:09,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:41:11,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 11:41:13,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:41:13,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 11:41:16,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:41:16,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:17,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:41:17,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:21,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 11:41:23,467 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 11:41:25,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=6266.666666666667, ans=0.20625 2023-09-28 11:41:26,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:41:26,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 11:41:29,574 INFO [train.py:1039] (1/4) Epoch 1, batch 950, loss[loss=0.5948, simple_loss=0.524, pruned_loss=0.3701, over 24619.00 frames. ], tot_loss[loss=0.72, simple_loss=0.6074, pruned_loss=0.537, over 4675673.90 frames. ], batch size: 65, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:41:29,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:33,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 11:41:38,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:42,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:42,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:43,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:41:46,841 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 11:41:48,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=6400.0, ans=0.04 2023-09-28 11:41:50,860 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.91 vs. limit=9.9 2023-09-28 11:41:51,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:53,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:41:53,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:53,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:41:53,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 11:41:54,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:41:56,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:56,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 11:41:59,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:04,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:42:05,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 11:42:07,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:42:11,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:42:12,470 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.38 vs. limit=12.35 2023-09-28 11:42:12,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:42:18,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:42:18,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:42:21,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 11:42:22,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.84 vs. limit=8.266666666666666 2023-09-28 11:42:23,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:42:23,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:42:25,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:25,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:25,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:42:25,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.43 vs. limit=6.613333333333333 2023-09-28 11:42:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 11:42:32,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:42:33,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:35,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:35,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 11:42:35,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:35,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:42:36,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 11:42:42,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:42:46,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:47,189 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.45 vs. limit=8.3 2023-09-28 11:42:50,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.61 vs. limit=12.45 2023-09-28 11:42:51,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:42:53,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 11:42:53,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 11:42:56,627 INFO [train.py:1039] (1/4) Epoch 1, batch 1000, loss[loss=0.5493, simple_loss=0.4789, pruned_loss=0.3456, over 23752.00 frames. ], tot_loss[loss=0.6875, simple_loss=0.5843, pruned_loss=0.4971, over 4690396.03 frames. ], batch size: 179, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:42:58,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:43:00,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=6666.666666666667, ans=0.03888888888888889 2023-09-28 11:43:01,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 11:43:01,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=6666.666666666667, ans=0.1875 2023-09-28 11:43:03,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:06,723 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.970e+02 4.014e+02 6.511e+02 1.253e+03 2.271e+03, threshold=1.302e+03, percent-clipped=16.0 2023-09-28 11:43:08,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:43:10,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 11:43:10,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 11:43:15,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:15,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:43:15,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:19,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 11:43:25,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 11:43:25,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=6733.333333333333, ans=0.009405797101449275 2023-09-28 11:43:26,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.63 vs. limit=12.55 2023-09-28 11:43:27,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 11:43:27,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:27,504 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=8.636e-02 2023-09-28 11:43:28,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 11:43:32,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:43:32,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 11:43:32,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:34,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:39,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=6800.0, ans=0.182 2023-09-28 11:43:43,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:44,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:43:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:45,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:45,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 11:43:47,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:43:49,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:49,382 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 11:43:52,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 11:43:54,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 11:43:55,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=6866.666666666667, ans=0.17812499999999998 2023-09-28 11:43:56,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 11:43:57,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.01 vs. limit=10.075 2023-09-28 11:43:57,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:44:03,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:04,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:44:04,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:06,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:44:09,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 11:44:09,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:44:11,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 11:44:11,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 11:44:12,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:12,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:44:15,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:44:17,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:44:18,148 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.76 vs. limit=10.1 2023-09-28 11:44:20,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:44:21,954 INFO [train.py:1039] (1/4) Epoch 1, batch 1050, loss[loss=0.4784, simple_loss=0.4319, pruned_loss=0.2792, over 24458.00 frames. ], tot_loss[loss=0.6588, simple_loss=0.5638, pruned_loss=0.4629, over 4693327.65 frames. ], batch size: 58, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:44:25,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:44:26,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:44:28,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:44:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:30,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=7000.0, ans=0.1 2023-09-28 11:44:33,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:44:35,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:44:36,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:44:40,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:44:40,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:44:40,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:44:42,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:44:42,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 11:44:43,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:44:43,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 11:44:47,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:47,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 11:44:47,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:44:51,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=7066.666666666667, ans=0.16875 2023-09-28 11:44:56,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:58,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:44:58,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:45:00,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=7133.333333333333, ans=0.036944444444444446 2023-09-28 11:45:01,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 11:45:01,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 11:45:01,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:45:02,178 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=10.175 2023-09-28 11:45:05,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 11:45:08,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 11:45:09,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:12,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:45:14,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:45:14,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:45:16,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:45:19,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:45:22,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 11:45:23,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.97 vs. limit=6.88 2023-09-28 11:45:25,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 11:45:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 11:45:26,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:26,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:45:28,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 11:45:33,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:45:35,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:35,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:45:36,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:36,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:39,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.48 vs. limit=10.225 2023-09-28 11:45:40,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 11:45:43,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:43,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 11:45:43,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 11:45:45,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:45:46,847 INFO [train.py:1039] (1/4) Epoch 1, batch 1100, loss[loss=0.5082, simple_loss=0.4685, pruned_loss=0.2832, over 24498.00 frames. ], tot_loss[loss=0.6331, simple_loss=0.5463, pruned_loss=0.4318, over 4702929.33 frames. ], batch size: 66, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:45:48,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:45:55,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:45:58,776 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 4.555e+02 7.978e+02 1.389e+03 3.645e+03, threshold=1.596e+03, percent-clipped=29.0 2023-09-28 11:46:00,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:46:00,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:46:00,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:02,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 11:46:04,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:46:05,018 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.39 vs. limit=10.275 2023-09-28 11:46:07,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:46:08,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:46:10,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:46:10,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 11:46:12,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:46:13,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:13,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:46:17,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:46:17,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=7400.0, ans=0.226 2023-09-28 11:46:20,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:46:23,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:46:27,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 11:46:27,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=7466.666666666667, ans=0.22533333333333333 2023-09-28 11:46:28,909 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 11:46:29,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:32,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:33,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:46:34,027 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.73 vs. limit=13.1 2023-09-28 11:46:34,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:46:34,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 11:46:34,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:46:34,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:46:35,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=7466.666666666667, ans=0.312 2023-09-28 11:46:36,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:46:36,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:37,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 11:46:39,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=7533.333333333333, ans=0.14687499999999998 2023-09-28 11:46:42,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:46:43,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 11:46:45,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:46:51,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:46:54,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 11:46:54,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:46:56,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:59,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:59,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:00,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 11:47:03,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:47:04,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:04,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=7600.0, ans=0.634 2023-09-28 11:47:06,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 11:47:06,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:47:06,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=7600.0, ans=0.224 2023-09-28 11:47:08,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 11:47:09,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:47:09,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:47:10,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:47:12,089 INFO [train.py:1039] (1/4) Epoch 1, batch 1150, loss[loss=0.4515, simple_loss=0.4174, pruned_loss=0.2495, over 24271.00 frames. ], tot_loss[loss=0.6115, simple_loss=0.5314, pruned_loss=0.4059, over 4701318.33 frames. ], batch size: 56, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:47:15,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:18,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:47:20,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:47:21,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:47:21,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 11:47:21,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:22,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=7666.666666666667, ans=0.140625 2023-09-28 11:47:25,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 11:47:26,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:26,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:47:31,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=7733.333333333333, ans=0.1375 2023-09-28 11:47:32,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 11:47:35,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:39,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:41,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:47:41,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 11:47:42,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=7733.333333333333, ans=0.00918840579710145 2023-09-28 11:47:43,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:47:43,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:47,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 11:47:48,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:50,744 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.65 vs. limit=10.425 2023-09-28 11:47:51,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:47:58,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=7800.0, ans=0.222 2023-09-28 11:47:59,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:07,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:07,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 11:48:09,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:09,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:16,051 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 11:48:18,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:26,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 11:48:30,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:48:32,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:48:32,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:48:33,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:48:34,809 INFO [train.py:1039] (1/4) Epoch 1, batch 1200, loss[loss=0.5395, simple_loss=0.4941, pruned_loss=0.3025, over 24671.00 frames. ], tot_loss[loss=0.5917, simple_loss=0.5184, pruned_loss=0.3823, over 4707641.66 frames. ], batch size: 73, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:48:35,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=8000.0, ans=0.22 2023-09-28 11:48:37,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:41,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:48:41,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:48:43,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:48:43,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:48:44,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:48:46,294 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 4.760e+02 7.806e+02 1.164e+03 2.947e+03, threshold=1.561e+03, percent-clipped=14.0 2023-09-28 11:48:46,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:48:48,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:48:50,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:50,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:51,843 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 11:48:54,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 11:49:00,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:49:02,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:49:03,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:06,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:06,751 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 11:49:08,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:12,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=8133.333333333333, ans=0.6153333333333333 2023-09-28 11:49:18,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:49:18,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:49:18,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 11:49:18,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=8133.333333333333, ans=0.125 2023-09-28 11:49:19,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:49:23,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 11:49:24,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 11:49:26,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:28,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:49:28,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:30,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:49:32,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:32,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:49:34,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:49:34,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 11:49:35,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:49:35,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:37,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:49:38,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:49:38,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:43,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:49:45,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:49:48,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 11:49:53,496 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 11:49:55,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:58,118 INFO [train.py:1039] (1/4) Epoch 1, batch 1250, loss[loss=0.4507, simple_loss=0.4206, pruned_loss=0.2439, over 24277.00 frames. ], tot_loss[loss=0.578, simple_loss=0.509, pruned_loss=0.3656, over 4696974.78 frames. ], batch size: 56, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:49:58,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:59,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:50:01,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:50:04,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 11:50:08,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:50:09,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:09,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 11:50:10,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.53 vs. limit=9.166666666666668 2023-09-28 11:50:11,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:50:12,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:50:15,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:50:18,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:19,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:50:19,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:21,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:50:25,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.21 vs. limit=9.2 2023-09-28 11:50:26,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:50:26,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:50:26,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:50:27,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:29,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:30,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:32,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:50:38,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 11:50:38,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:50:41,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:50:41,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 11:50:41,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:42,841 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 11:50:42,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:42,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:47,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:52,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:52,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:50:54,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 11:50:54,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 11:50:55,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 11:50:58,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:00,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 11:51:00,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:04,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:51:04,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:51:05,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=8600.0, ans=0.0 2023-09-28 11:51:07,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 11:51:07,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:51:07,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:51:10,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:51:10,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:12,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 11:51:13,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:17,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:51:18,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:51:20,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:51:22,007 INFO [train.py:1039] (1/4) Epoch 1, batch 1300, loss[loss=0.608, simple_loss=0.5164, pruned_loss=0.3786, over 19985.00 frames. ], tot_loss[loss=0.564, simple_loss=0.4997, pruned_loss=0.3492, over 4686077.16 frames. ], batch size: 388, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:51:23,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:23,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 11:51:28,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.26 vs. limit=9.333333333333332 2023-09-28 11:51:30,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:31,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:51:32,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:51:34,987 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 3.707e+02 6.388e+02 1.142e+03 3.121e+03, threshold=1.278e+03, percent-clipped=13.0 2023-09-28 11:51:35,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:38,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:51:38,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 11:51:43,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:51:45,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:51:47,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 11:51:50,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:51:55,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:51:55,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:57,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:58,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:00,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:52:00,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:52:00,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 11:52:04,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=8800.0, ans=0.125 2023-09-28 11:52:08,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:52:08,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:52:10,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 11:52:10,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:52:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:52:14,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:52:15,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 11:52:16,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:16,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 11:52:20,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:22,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:52:22,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:52:25,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 11:52:27,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 11:52:29,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 11:52:31,645 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.87 vs. limit=14.2 2023-09-28 11:52:32,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:52:36,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 11:52:39,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:42,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=8933.333333333334, ans=0.5873333333333334 2023-09-28 11:52:45,143 INFO [train.py:1039] (1/4) Epoch 1, batch 1350, loss[loss=0.5033, simple_loss=0.4709, pruned_loss=0.2705, over 24006.00 frames. ], tot_loss[loss=0.5485, simple_loss=0.4893, pruned_loss=0.3327, over 4688324.44 frames. ], batch size: 80, lr: 4.46e-02, grad_scale: 4.0 2023-09-28 11:52:46,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 11:52:48,573 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 11:52:49,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:52,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:52:55,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:55,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:56,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:52:58,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:03,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:05,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 11:53:05,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:05,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=9066.666666666666, ans=0.20933333333333334 2023-09-28 11:53:06,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:53:07,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=9066.666666666666, ans=0.20933333333333334 2023-09-28 11:53:10,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 11:53:11,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.63 vs. limit=7.626666666666667 2023-09-28 11:53:11,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:53:13,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:53:13,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 11:53:15,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=9066.666666666666, ans=0.028888888888888895 2023-09-28 11:53:16,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 11:53:17,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 11:53:19,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:19,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 11:53:29,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=9133.333333333334, ans=0.125 2023-09-28 11:53:30,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:41,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 11:53:42,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:45,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 11:53:45,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:46,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:53:49,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:53:49,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.52 vs. limit=10.95 2023-09-28 11:53:50,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 11:53:53,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:53:53,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=9266.666666666666, ans=0.025 2023-09-28 11:53:54,291 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.00 vs. limit=7.316666666666666 2023-09-28 11:54:00,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 11:54:02,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 11:54:05,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=9266.666666666666, ans=0.125 2023-09-28 11:54:09,172 INFO [train.py:1039] (1/4) Epoch 1, batch 1400, loss[loss=0.4749, simple_loss=0.4524, pruned_loss=0.2477, over 24033.00 frames. ], tot_loss[loss=0.5305, simple_loss=0.4776, pruned_loss=0.3147, over 4693365.52 frames. ], batch size: 80, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:54:09,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 11:54:11,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:54:11,656 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.47 vs. limit=11.0 2023-09-28 11:54:16,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:54:16,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:54:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 11:54:23,717 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 3.566e+02 5.835e+02 9.354e+02 4.572e+03, threshold=1.167e+03, percent-clipped=13.0 2023-09-28 11:54:23,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 11:54:33,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:54:35,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:54:37,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:54:37,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:54:40,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:54:41,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=9466.666666666666, ans=0.5686666666666667 2023-09-28 11:54:42,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:54:42,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=9466.666666666666, ans=0.125 2023-09-28 11:54:45,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=9466.666666666666, ans=0.125 2023-09-28 11:54:48,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.56 vs. limit=9.733333333333333 2023-09-28 11:54:52,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:52,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:57,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 11:54:58,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:54:58,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:55:00,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:55:00,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:02,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:55:02,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:55:02,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:55:05,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 11:55:05,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:55:08,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=9533.333333333334, ans=0.125 2023-09-28 11:55:10,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:13,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:55:21,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=9600.0, ans=0.02666666666666667 2023-09-28 11:55:23,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 11:55:25,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:55:25,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:55:28,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:55:30,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:31,862 INFO [train.py:1039] (1/4) Epoch 1, batch 1450, loss[loss=0.4466, simple_loss=0.4341, pruned_loss=0.2258, over 24506.00 frames. ], tot_loss[loss=0.5177, simple_loss=0.4696, pruned_loss=0.3013, over 4690796.52 frames. ], batch size: 66, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:55:31,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:55:35,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:55:36,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:55:36,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:36,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:55:42,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:44,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:55:44,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:44,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 11:55:46,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:55:47,963 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.76 vs. limit=14.8 2023-09-28 11:55:48,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 11:55:50,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:50,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:50,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 11:55:52,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:55:54,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:55:56,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:55:56,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:57,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:55:58,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=9733.333333333334, ans=0.0 2023-09-28 11:55:59,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:00,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:04,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:56:04,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:56:07,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:56:07,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:08,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:09,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:56:10,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:10,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:13,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 11:56:18,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:56:21,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 11:56:23,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:25,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:56:27,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:29,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 11:56:33,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:35,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 11:56:36,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 11:56:38,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:40,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=9933.333333333334, ans=0.125 2023-09-28 11:56:40,998 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.85 vs. limit=11.225 2023-09-28 11:56:41,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:56:42,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:43,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 11:56:45,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 11:56:45,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 11:56:46,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:48,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:56:54,978 INFO [train.py:1039] (1/4) Epoch 1, batch 1500, loss[loss=0.4773, simple_loss=0.4377, pruned_loss=0.2629, over 23691.00 frames. ], tot_loss[loss=0.5067, simple_loss=0.4632, pruned_loss=0.2896, over 4705935.42 frames. ], batch size: 232, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:56:59,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 11:56:59,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:56:59,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:57:00,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:02,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:02,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:57:02,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=10000.0, ans=0.125 2023-09-28 11:57:04,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 11:57:06,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:57:06,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:57:06,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:07,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:57:10,821 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.048e+02 3.499e+02 5.909e+02 9.288e+02 2.563e+03, threshold=1.182e+03, percent-clipped=18.0 2023-09-28 11:57:10,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:57:12,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 11:57:18,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:57:18,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:57:19,518 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=8.026666666666667 2023-09-28 11:57:20,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:23,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 11:57:25,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=10066.666666666666, ans=0.125 2023-09-28 11:57:26,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 11:57:27,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=10133.333333333334, ans=0.024444444444444446 2023-09-28 11:57:28,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:28,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 11:57:32,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:57:35,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:57:37,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:37,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:57:39,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 11:57:39,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:57:39,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:41,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 11:57:42,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:46,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=10200.0, ans=0.02416666666666667 2023-09-28 11:57:47,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:57:47,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 11:57:53,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:57:53,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=10200.0, ans=0.543 2023-09-28 11:57:55,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:57:59,731 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 11:58:01,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:01,216 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 11:58:02,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:04,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:06,032 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 11:58:06,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:58:09,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 11:58:11,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:16,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:16,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,380 INFO [train.py:1039] (1/4) Epoch 1, batch 1550, loss[loss=0.4514, simple_loss=0.4381, pruned_loss=0.2296, over 24389.00 frames. ], tot_loss[loss=0.4972, simple_loss=0.4571, pruned_loss=0.2801, over 4698566.04 frames. ], batch size: 77, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 11:58:18,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:18,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:58:20,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 11:58:21,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.52 vs. limit=15.25 2023-09-28 11:58:21,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 11:58:21,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:58:23,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 11:58:23,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 11:58:25,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:26,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:26,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:58:26,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:58:29,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:29,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:31,421 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 11:58:32,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:32,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:58:32,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:58:34,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=10400.0, ans=0.09899494936611666 2023-09-28 11:58:36,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:58:36,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 11:58:37,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:37,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 11:58:40,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 11:58:40,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 11:58:40,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:42,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:58:47,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:49,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 11:58:49,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 11:58:49,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=10400.0, ans=0.196 2023-09-28 11:58:50,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=10400.0, ans=0.125 2023-09-28 11:58:54,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=10466.666666666666, ans=0.008594202898550726 2023-09-28 11:58:58,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:02,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:59:02,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:59:02,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:59:03,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 11:59:09,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:59:11,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:15,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:59:18,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:59:18,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:20,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 11:59:20,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:20,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:59:20,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=10533.333333333334, ans=0.0 2023-09-28 11:59:21,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:23,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:59:23,365 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 11:59:25,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:31,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 11:59:36,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:36,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=10600.0, ans=0.0 2023-09-28 11:59:38,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:39,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 11:59:41,359 INFO [train.py:1039] (1/4) Epoch 1, batch 1600, loss[loss=0.5885, simple_loss=0.5104, pruned_loss=0.3442, over 19648.00 frames. ], tot_loss[loss=0.4899, simple_loss=0.4532, pruned_loss=0.2723, over 4706027.99 frames. ], batch size: 388, lr: 4.45e-02, grad_scale: 16.0 2023-09-28 11:59:43,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:44,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:44,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:59:44,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:59:44,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:59:49,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:49,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 11:59:51,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 11:59:55,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 11:59:56,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:59:58,016 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.152e+02 3.597e+02 5.871e+02 8.452e+02 2.438e+03, threshold=1.174e+03, percent-clipped=11.0 2023-09-28 11:59:58,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 11:59:58,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:02,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:00:05,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:00:08,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 12:00:11,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:00:13,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 12:00:13,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:14,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 12:00:19,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 12:00:23,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=10800.0, ans=0.125 2023-09-28 12:00:28,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:28,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 12:00:28,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:30,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:00:30,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:00:30,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=10866.666666666666, ans=0.125 2023-09-28 12:00:30,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=10866.666666666666, ans=0.125 2023-09-28 12:00:33,532 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:00:35,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:00:40,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:00:40,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:00:41,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:41,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:43,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:00:45,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:00:45,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:00:48,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:00:55,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:55,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:57,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 12:00:57,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:00:59,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 12:00:59,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=10933.333333333334, ans=10.0 2023-09-28 12:01:04,474 INFO [train.py:1039] (1/4) Epoch 1, batch 1650, loss[loss=0.4047, simple_loss=0.4044, pruned_loss=0.1984, over 24544.00 frames. ], tot_loss[loss=0.4799, simple_loss=0.4476, pruned_loss=0.2628, over 4724107.15 frames. ], batch size: 60, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 12:01:04,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:08,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:08,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:01:08,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 12:01:08,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 12:01:08,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 12:01:08,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=11000.0, ans=0.5150000000000001 2023-09-28 12:01:09,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 12:01:14,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:01:16,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:16,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:01:16,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:01:18,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:21,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 12:01:22,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:01:22,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:22,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:01:24,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:01:24,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 12:01:25,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 12:01:33,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:01:34,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:01:42,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 12:01:43,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:47,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 12:01:47,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=11133.333333333334, ans=0.125 2023-09-28 12:01:48,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:01:50,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:01:50,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:01:52,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:01:53,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:55,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:59,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:01,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:01,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:03,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:02:03,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=11200.0, ans=0.125 2023-09-28 12:02:06,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:06,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 12:02:08,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:08,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 12:02:10,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 12:02:10,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 12:02:11,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:13,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:02:13,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:13,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:02:13,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 12:02:18,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:20,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:02:20,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:24,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 12:02:28,772 INFO [train.py:1039] (1/4) Epoch 1, batch 1700, loss[loss=0.3757, simple_loss=0.3761, pruned_loss=0.1843, over 18108.00 frames. ], tot_loss[loss=0.4701, simple_loss=0.4413, pruned_loss=0.2544, over 4718829.76 frames. ], batch size: 39, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:02:29,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:29,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:02:29,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 12:02:30,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:30,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:02:30,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:33,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:02:33,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:02:33,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 12:02:37,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:02:40,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=11333.333333333334, ans=0.09899494936611666 2023-09-28 12:02:45,393 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.253e+02 3.835e+02 6.904e+02 1.046e+03 2.238e+03, threshold=1.381e+03, percent-clipped=16.0 2023-09-28 12:02:45,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:49,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:02:51,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=11400.0, ans=0.008391304347826088 2023-09-28 12:02:51,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=11400.0, ans=0.186 2023-09-28 12:02:57,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:02:57,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:02:59,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:59,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:02,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 12:03:05,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:03:05,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:06,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:03:08,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:03:10,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 12:03:10,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 12:03:12,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:13,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 12:03:15,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:03:24,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:24,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:24,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:03:26,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:03:26,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 12:03:27,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:29,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:29,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 12:03:30,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=11533.333333333334, ans=0.018611111111111106 2023-09-28 12:03:31,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:03:31,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:31,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:31,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:34,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:34,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:03:36,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:36,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:03:36,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:41,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:41,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 12:03:44,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:46,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:48,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 12:03:52,709 INFO [train.py:1039] (1/4) Epoch 1, batch 1750, loss[loss=0.4066, simple_loss=0.4073, pruned_loss=0.1999, over 24604.00 frames. ], tot_loss[loss=0.4601, simple_loss=0.4348, pruned_loss=0.2462, over 4712586.17 frames. ], batch size: 60, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:03:56,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:58,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:59,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:04:01,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 12:04:01,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:04:04,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:04:05,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:08,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 12:04:11,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:12,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.23 vs. limit=7.933333333333334 2023-09-28 12:04:13,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 12:04:14,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:16,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:04:19,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:04:21,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 12:04:22,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:04:22,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 12:04:33,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:04:34,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:04:34,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:39,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:39,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:41,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:04:43,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:46,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:47,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:47,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 12:04:49,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:51,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 12:04:53,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:04:54,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=11.95 2023-09-28 12:04:54,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:56,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:05:01,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:05:01,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 12:05:03,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:06,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:05:11,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:12,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:15,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:05:15,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 12:05:15,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:17,317 INFO [train.py:1039] (1/4) Epoch 1, batch 1800, loss[loss=0.4298, simple_loss=0.4336, pruned_loss=0.2102, over 24342.00 frames. ], tot_loss[loss=0.4511, simple_loss=0.4299, pruned_loss=0.2386, over 4722186.08 frames. ], batch size: 77, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:05:17,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:05:17,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:17,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:05:17,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:05:17,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:05:20,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:05:22,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:24,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:05:27,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:30,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:05:32,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:05:33,423 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.239e+02 3.495e+02 5.189e+02 7.461e+02 1.869e+03, threshold=1.038e+03, percent-clipped=4.0 2023-09-28 12:05:35,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:35,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=12066.666666666666, ans=0.17933333333333334 2023-09-28 12:05:37,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:39,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:41,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:05:42,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:42,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 12:05:44,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:47,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=12066.666666666666, ans=0.125 2023-09-28 12:05:48,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:52,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 12:05:53,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=12133.333333333334, ans=0.125 2023-09-28 12:05:54,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 12:05:54,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 12:05:55,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=12.05 2023-09-28 12:05:55,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:56,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:56,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:57,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:05:59,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=12133.333333333334, ans=0.47533333333333333 2023-09-28 12:06:05,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 12:06:07,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:06:08,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:10,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 12:06:10,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 12:06:10,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=12200.0, ans=0.015833333333333338 2023-09-28 12:06:11,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:06:14,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:06:14,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:06:19,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 12:06:27,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:06:29,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 12:06:29,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:06:29,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:29,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:06:30,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 12:06:32,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=12266.666666666666, ans=0.125 2023-09-28 12:06:34,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:06:34,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:06:37,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 12:06:37,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:37,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=12266.666666666666, ans=0.008202898550724638 2023-09-28 12:06:39,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:40,780 INFO [train.py:1039] (1/4) Epoch 1, batch 1850, loss[loss=0.4942, simple_loss=0.4471, pruned_loss=0.2723, over 22818.00 frames. ], tot_loss[loss=0.447, simple_loss=0.4276, pruned_loss=0.2348, over 4725969.57 frames. ], batch size: 322, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:06:40,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:06:40,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:42,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:06:44,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:06:44,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:48,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:06:48,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:06:53,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=12333.333333333334, ans=0.125 2023-09-28 12:06:56,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:06:56,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 12:07:00,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 12:07:04,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 12:07:07,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:07,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 12:07:07,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 12:07:08,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=12400.0, ans=0.125 2023-09-28 12:07:17,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:07:19,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 12:07:23,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:07:23,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:29,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 12:07:29,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:29,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:07:31,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:07:34,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:07:37,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:07:40,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:07:40,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:40,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:07:40,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:43,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:43,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:07:46,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 12:07:46,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:51,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:07:53,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:07:53,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 12:07:53,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 12:07:55,376 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 12:07:55,502 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 12:07:57,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:07:57,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:07:59,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:00,598 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 12:08:00,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:08:01,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:03,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:08:04,769 INFO [train.py:1039] (1/4) Epoch 1, batch 1900, loss[loss=0.4409, simple_loss=0.4397, pruned_loss=0.2199, over 23972.00 frames. ], tot_loss[loss=0.4436, simple_loss=0.4263, pruned_loss=0.2316, over 4721021.32 frames. ], batch size: 86, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:08:04,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:08:06,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:08:06,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 12:08:08,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:08,113 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 12:08:08,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:08:09,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:08:18,043 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 12:08:18,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 12:08:20,940 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.193e+02 3.536e+02 5.623e+02 9.146e+02 3.125e+03, threshold=1.125e+03, percent-clipped=17.0 2023-09-28 12:08:21,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:08:21,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:08:21,245 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 12:08:22,667 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 12:08:29,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 12:08:31,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:08:35,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=12733.333333333334, ans=0.0 2023-09-28 12:08:36,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 12:08:38,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 12:08:48,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 12:08:51,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 12:08:51,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:52,844 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 12:08:52,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 12:08:52,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 12:08:54,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 12:08:54,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:08:57,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 12:09:00,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:09:05,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:05,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 12:09:08,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:09:08,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=12866.666666666666, ans=0.125 2023-09-28 12:09:13,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 12:09:13,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:19,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=12933.333333333334, ans=0.0 2023-09-28 12:09:20,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:09:20,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:09:20,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:09:20,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:09:24,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:09:24,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:09:24,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:09:27,546 INFO [train.py:1039] (1/4) Epoch 1, batch 1950, loss[loss=0.4067, simple_loss=0.4304, pruned_loss=0.1905, over 24436.00 frames. ], tot_loss[loss=0.438, simple_loss=0.4238, pruned_loss=0.2268, over 4727350.99 frames. ], batch size: 69, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:09:27,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:27,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:09:29,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=13000.0, ans=0.445 2023-09-28 12:09:30,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:09:30,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:30,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:32,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:34,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:37,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:09:37,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:37,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:09:42,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 12:09:42,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:09:42,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:44,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:47,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:09:47,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:09:47,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:50,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:09:53,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:53,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:09:53,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:09:53,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:58,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:00,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:10:00,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:01,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:10:01,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 12:10:03,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:10:03,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:10:03,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:08,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:11,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:10:13,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:10:14,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=13133.333333333334, ans=0.09899494936611666 2023-09-28 12:10:17,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:10:19,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:10:19,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 12:10:20,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:25,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:10:25,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:10:26,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:34,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:36,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:38,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:40,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:42,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:10:42,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:43,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 12:10:43,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:10:45,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:47,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 12:10:48,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=13266.666666666666, ans=0.125 2023-09-28 12:10:49,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=13333.333333333334, ans=0.125 2023-09-28 12:10:51,424 INFO [train.py:1039] (1/4) Epoch 1, batch 2000, loss[loss=0.3972, simple_loss=0.4086, pruned_loss=0.1929, over 24595.00 frames. ], tot_loss[loss=0.4356, simple_loss=0.423, pruned_loss=0.2246, over 4712700.98 frames. ], batch size: 60, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:10:51,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:10:56,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:56,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:10:57,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:57,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:11:00,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:02,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=13333.333333333334, ans=0.16666666666666666 2023-09-28 12:11:05,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 12:11:05,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:11:06,950 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.094e+02 3.925e+02 5.056e+02 7.202e+02 2.152e+03, threshold=1.011e+03, percent-clipped=10.0 2023-09-28 12:11:08,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:11:10,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 12:11:11,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=13400.0, ans=0.43100000000000005 2023-09-28 12:11:12,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:11:12,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:11:15,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:11:15,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 12:11:16,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:17,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=13400.0, ans=0.16599999999999998 2023-09-28 12:11:18,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:18,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:20,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 12:11:20,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:11:23,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 12:11:23,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:28,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:11:29,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:11:29,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:31,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:32,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:32,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 12:11:35,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 12:11:35,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:35,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:11:36,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=13466.666666666666, ans=0.16533333333333333 2023-09-28 12:11:40,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:41,780 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.62 vs. limit=17.65 2023-09-28 12:11:42,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:11:42,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:44,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:11:46,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:46,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:46,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=13533.333333333334, ans=0.09899494936611666 2023-09-28 12:11:47,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:47,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:49,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:52,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:53,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 12:12:01,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:12:03,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:12:09,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:11,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:11,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:11,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=13600.0, ans=0.16399999999999998 2023-09-28 12:12:12,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:12:12,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:12:14,461 INFO [train.py:1039] (1/4) Epoch 1, batch 2050, loss[loss=0.3981, simple_loss=0.4069, pruned_loss=0.1946, over 24314.00 frames. ], tot_loss[loss=0.4271, simple_loss=0.4185, pruned_loss=0.2182, over 4723869.20 frames. ], batch size: 61, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:12:14,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:16,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:19,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:19,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:23,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=13666.666666666666, ans=0.125 2023-09-28 12:12:24,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:12:26,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:12:26,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:27,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:12:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 12:12:31,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:12:34,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:12:34,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:12:34,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=13733.333333333334, ans=0.009444444444444443 2023-09-28 12:12:36,881 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.63 vs. limit=12.65 2023-09-28 12:12:42,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:42,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:43,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=13733.333333333334, ans=12.65 2023-09-28 12:12:43,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 12:12:46,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:49,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 12:12:50,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:53,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:55,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:12:55,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:12:56,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:56,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:12:58,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:12:59,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:13:03,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:05,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:13:07,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:13:09,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:13:12,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:20,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:13:20,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 12:13:25,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=13933.333333333334, ans=0.16066666666666665 2023-09-28 12:13:26,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:28,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:13:29,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:13:32,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 12:13:32,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=13933.333333333334, ans=0.41233333333333333 2023-09-28 12:13:35,634 INFO [train.py:1039] (1/4) Epoch 1, batch 2100, loss[loss=0.4215, simple_loss=0.4313, pruned_loss=0.2058, over 23806.00 frames. ], tot_loss[loss=0.4204, simple_loss=0.4149, pruned_loss=0.2132, over 4731397.74 frames. ], batch size: 85, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:13:36,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=14000.0, ans=0.125 2023-09-28 12:13:38,021 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 12:13:38,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:38,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:38,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:13:40,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:40,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 12:13:41,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 12:13:43,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:45,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=14000.0, ans=0.008333333333333338 2023-09-28 12:13:47,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:13:48,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:13:49,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:51,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:13:51,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 12:13:52,531 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.170e+02 3.843e+02 5.173e+02 8.078e+02 2.053e+03, threshold=1.035e+03, percent-clipped=17.0 2023-09-28 12:13:52,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:13:52,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 12:13:52,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 12:13:54,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:13:54,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:13:54,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 12:13:56,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:14:01,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 12:14:01,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:14:04,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:14:04,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:14:08,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:14:08,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 12:14:09,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:09,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 12:14:12,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 12:14:12,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:13,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 12:14:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 12:14:13,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 12:14:16,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:14:19,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:14:21,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:23,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:24,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:27,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:27,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 12:14:27,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:27,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:29,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:29,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 12:14:31,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 12:14:32,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 12:14:37,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:14:39,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:14:39,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 12:14:41,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.81 vs. limit=8.566666666666666 2023-09-28 12:14:46,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:48,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:14:49,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:14:49,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:14:49,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 12:14:51,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:14:52,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:52,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:14:54,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:14:54,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:56,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 12:14:58,988 INFO [train.py:1039] (1/4) Epoch 1, batch 2150, loss[loss=0.3618, simple_loss=0.3765, pruned_loss=0.1736, over 24320.00 frames. ], tot_loss[loss=0.4154, simple_loss=0.412, pruned_loss=0.2096, over 4733599.22 frames. ], batch size: 56, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:14:59,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 12:14:59,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:14:59,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.01 vs. limit=18.25 2023-09-28 12:15:02,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:02,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:15:02,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:15:03,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:15:10,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:15:10,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:11,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:12,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.38 vs. limit=12.875 2023-09-28 12:15:13,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:15:13,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:15,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:15:20,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:20,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:15:20,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:15:27,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:27,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 12:15:31,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:31,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.00 vs. limit=12.925 2023-09-28 12:15:32,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:15:34,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:34,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:35,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:35,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:15:37,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:37,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:15:37,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:15:39,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 12:15:40,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:15:40,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:42,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:42,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:15:43,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.29 vs. limit=18.35 2023-09-28 12:15:44,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:15:44,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=14466.666666666666, ans=10.0 2023-09-28 12:15:47,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:47,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:15:49,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:49,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 12:15:49,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:15:53,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:53,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:56,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:56,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:15:56,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:59,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:59,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 12:16:01,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 12:16:01,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:16:01,495 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 12:16:02,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:04,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:16:04,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 12:16:04,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:16:05,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 12:16:05,923 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 12:16:05,923 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 12:16:05,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 12:16:08,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:10,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:16:10,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:16:10,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:11,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=14600.0, ans=0.005833333333333336 2023-09-28 12:16:12,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:16:13,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:13,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:15,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=14600.0, ans=0.0076956521739130436 2023-09-28 12:16:20,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:16:20,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 12:16:21,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=14666.666666666666, ans=0.15333333333333335 2023-09-28 12:16:22,796 INFO [train.py:1039] (1/4) Epoch 1, batch 2200, loss[loss=0.4638, simple_loss=0.4152, pruned_loss=0.2562, over 19190.00 frames. ], tot_loss[loss=0.4114, simple_loss=0.4098, pruned_loss=0.2067, over 4725847.72 frames. ], batch size: 388, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:16:23,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=14666.666666666666, ans=0.125 2023-09-28 12:16:24,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:16:31,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:33,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:16:33,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:16:33,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:16:36,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:36,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=14666.666666666666, ans=0.15333333333333335 2023-09-28 12:16:37,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:16:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 12:16:39,267 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 4.143e+02 6.351e+02 9.037e+02 1.826e+03, threshold=1.270e+03, percent-clipped=17.0 2023-09-28 12:16:41,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 12:16:44,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:16:50,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 12:16:52,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:54,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:16:55,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:17:00,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:17:00,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 12:17:05,496 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:17:06,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:17:08,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:08,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 12:17:11,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:17:13,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:14,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=14866.666666666666, ans=0.125 2023-09-28 12:17:16,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:17:17,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:18,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=14866.666666666666, ans=0.004722222222222225 2023-09-28 12:17:20,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 12:17:22,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:23,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 12:17:25,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:25,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:17:25,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:27,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:17:28,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:28,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:28,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:30,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:17:32,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:17:33,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:17:36,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:17:37,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:17:39,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:17:41,112 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 12:17:44,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:17:44,934 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 12:17:46,372 INFO [train.py:1039] (1/4) Epoch 1, batch 2250, loss[loss=0.3905, simple_loss=0.4146, pruned_loss=0.1832, over 24566.00 frames. ], tot_loss[loss=0.4091, simple_loss=0.4089, pruned_loss=0.2048, over 4720649.68 frames. ], batch size: 71, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:17:46,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:17:46,547 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 12:17:47,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:48,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:17:49,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:51,308 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 12:17:51,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:17:54,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:17:56,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=15000.0, ans=0.125 2023-09-28 12:17:56,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=15000.0, ans=0.15000000000000002 2023-09-28 12:17:58,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.59 vs. limit=12.5 2023-09-28 12:18:00,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:18:03,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:18:04,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=15066.666666666666, ans=0.125 2023-09-28 12:18:05,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:07,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:08,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:11,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 12:18:11,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:11,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:18:14,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 12:18:14,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:18:14,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:18,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:23,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:24,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:18:26,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:18:26,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 12:18:27,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:31,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:18:32,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:34,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:36,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:18:36,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:39,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:39,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:18:45,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:18:47,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:18:52,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:18:52,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:18:53,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:18:58,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=15266.666666666666, ans=0.125 2023-09-28 12:18:59,005 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.53 vs. limit=13.225 2023-09-28 12:18:59,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:19:02,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:19:02,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 12:19:02,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:04,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:19:04,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=15266.666666666666, ans=0.00755072463768116 2023-09-28 12:19:07,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 12:19:09,228 INFO [train.py:1039] (1/4) Epoch 1, batch 2300, loss[loss=0.3226, simple_loss=0.3593, pruned_loss=0.143, over 24336.00 frames. ], tot_loss[loss=0.406, simple_loss=0.4082, pruned_loss=0.202, over 4728422.22 frames. ], batch size: 61, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:19:10,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:19:10,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:19:20,927 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 12:19:21,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=13.25 2023-09-28 12:19:24,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:27,594 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.211e+02 3.558e+02 5.040e+02 6.600e+02 1.327e+03, threshold=1.008e+03, percent-clipped=3.0 2023-09-28 12:19:30,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:19:30,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:19:32,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:19:32,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:32,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 12:19:35,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:19:37,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:19:37,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:19:41,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:19:43,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:19:48,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:19:53,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:19:53,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:57,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:20:00,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:01,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.19 vs. limit=12.766666666666667 2023-09-28 12:20:02,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:20:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:20:03,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:20:03,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 12:20:07,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:20:07,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:07,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=15533.333333333334, ans=0.001944444444444443 2023-09-28 12:20:08,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:08,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:20:08,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:20:10,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:20:10,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 12:20:10,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:20:10,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:10,559 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:20:11,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 12:20:17,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:20:21,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:20:23,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=15600.0, ans=0.14400000000000002 2023-09-28 12:20:25,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=15600.0, ans=0.14400000000000002 2023-09-28 12:20:27,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:27,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:20:28,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:20:32,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:20:32,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:20:33,951 INFO [train.py:1039] (1/4) Epoch 1, batch 2350, loss[loss=0.3729, simple_loss=0.4037, pruned_loss=0.1711, over 24650.00 frames. ], tot_loss[loss=0.4022, simple_loss=0.4063, pruned_loss=0.1991, over 4734132.65 frames. ], batch size: 68, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:20:34,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:20:34,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 12:20:37,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=15666.666666666666, ans=0.125 2023-09-28 12:20:39,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:20:39,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 12:20:45,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 12:20:49,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:50,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=15733.333333333334, ans=0.125 2023-09-28 12:20:54,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:54,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:55,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:20:56,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:56,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 12:20:57,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.41 vs. limit=19.3 2023-09-28 12:20:58,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:21:01,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 12:21:06,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:21:09,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:21:09,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:21:12,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:21:12,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 12:21:12,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:21:15,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:21:16,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:17,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:21:21,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:21:23,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 12:21:23,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:21:26,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:21:26,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:21:28,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 12:21:30,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:21:33,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 12:21:33,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:21:37,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 12:21:37,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=15866.666666666666, ans=0.0 2023-09-28 12:21:41,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 12:21:41,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:41,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 12:21:43,241 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 12:21:43,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 12:21:44,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 12:21:47,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:21:53,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:21:53,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=16000.0, ans=0.007391304347826087 2023-09-28 12:21:54,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.78 vs. limit=13.5 2023-09-28 12:21:55,463 INFO [train.py:1039] (1/4) Epoch 1, batch 2400, loss[loss=0.392, simple_loss=0.4161, pruned_loss=0.1839, over 24073.00 frames. ], tot_loss[loss=0.3984, simple_loss=0.4039, pruned_loss=0.1966, over 4730372.55 frames. ], batch size: 80, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:21:59,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:21:59,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:22:01,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 12:22:01,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 12:22:01,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=16000.0, ans=0.125 2023-09-28 12:22:09,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:22:09,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:13,391 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.157e+02 3.788e+02 5.121e+02 7.907e+02 1.984e+03, threshold=1.024e+03, percent-clipped=10.0 2023-09-28 12:22:13,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 12:22:13,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:22:15,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:15,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 12:22:21,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:24,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 12:22:27,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:22:32,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 12:22:37,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:22:38,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:43,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:22:45,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 12:22:45,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:22:48,300 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.37 vs. limit=13.1 2023-09-28 12:22:52,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:54,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:22:55,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:57,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:22:57,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:22:57,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:22:57,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:57,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:22:57,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:23:02,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:03,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:23:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 12:23:05,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 12:23:07,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:23:07,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:23:07,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=16266.666666666666, ans=0.0 2023-09-28 12:23:08,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 12:23:09,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 12:23:09,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 12:23:09,040 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 12:23:09,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.41 vs. limit=13.6 2023-09-28 12:23:12,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 12:23:12,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:23:14,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:14,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:15,593 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 12:23:17,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:17,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:23:18,625 INFO [train.py:1039] (1/4) Epoch 1, batch 2450, loss[loss=0.3551, simple_loss=0.3777, pruned_loss=0.1663, over 24293.00 frames. ], tot_loss[loss=0.3922, simple_loss=0.3997, pruned_loss=0.1925, over 4716039.32 frames. ], batch size: 56, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:23:21,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:23:21,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:25,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:25,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:23:27,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 12:23:31,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:23:33,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:35,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:23:36,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:23:36,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:23:36,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 12:23:42,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:44,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:23:44,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:49,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:23:51,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:51,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:52,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:55,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 12:23:57,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:23:57,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=16466.666666666668, ans=0.125 2023-09-28 12:24:04,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:06,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:24:06,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:07,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:24:07,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:09,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:24:09,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 12:24:12,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:24:14,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:24:18,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:24:18,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:22,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:24:24,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 12:24:24,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:24:26,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:24:26,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 12:24:26,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:24:27,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:24:30,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=16600.0, ans=0.125 2023-09-28 12:24:32,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:24:34,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:35,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:24:39,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 12:24:40,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:24:42,492 INFO [train.py:1039] (1/4) Epoch 1, batch 2500, loss[loss=0.3448, simple_loss=0.364, pruned_loss=0.1628, over 23768.00 frames. ], tot_loss[loss=0.3889, simple_loss=0.3979, pruned_loss=0.19, over 4706973.22 frames. ], batch size: 164, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:24:47,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:24:47,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=16666.666666666668, ans=0.125 2023-09-28 12:24:53,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.71 vs. limit=13.75 2023-09-28 12:24:57,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:24:58,653 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.092e+02 3.311e+02 4.772e+02 6.840e+02 1.468e+03, threshold=9.543e+02, percent-clipped=7.0 2023-09-28 12:24:58,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:25:00,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:25:00,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 12:25:08,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:25:08,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:09,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:25:09,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:25:09,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 12:25:12,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:14,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 12:25:14,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 12:25:16,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:20,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:25:22,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:24,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:25:24,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 12:25:26,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:25:29,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:32,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:37,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:39,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:40,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=16866.666666666668, ans=0.0 2023-09-28 12:25:45,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:25:45,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=16866.666666666668, ans=0.0 2023-09-28 12:25:46,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 12:25:48,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:48,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:25:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:25:50,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:25:50,262 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 12:25:50,262 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 12:25:50,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 12:25:54,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:56,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 12:25:56,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 12:25:57,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:59,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 12:26:02,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 12:26:03,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=16933.333333333332, ans=0.125 2023-09-28 12:26:06,326 INFO [train.py:1039] (1/4) Epoch 1, batch 2550, loss[loss=0.3135, simple_loss=0.3558, pruned_loss=0.1356, over 24534.00 frames. ], tot_loss[loss=0.386, simple_loss=0.3968, pruned_loss=0.1877, over 4723440.69 frames. ], batch size: 63, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:26:06,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:06,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:26:08,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:26:09,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:11,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 12:26:11,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:26:12,071 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.74 vs. limit=20.25 2023-09-28 12:26:12,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.39 vs. limit=13.875 2023-09-28 12:26:16,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 12:26:18,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:26:20,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:21,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:26:21,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 12:26:23,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:23,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:23,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:27,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:26:27,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 12:26:27,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:26:27,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 12:26:41,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:26:43,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=17133.333333333332, ans=0.125 2023-09-28 12:26:47,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:26:47,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:47,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:49,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:26:55,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:56,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=17200.0, ans=0.025 2023-09-28 12:26:58,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:58,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:26:58,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:26:59,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:26:59,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:27:02,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:04,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:05,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=17200.0, ans=9.3 2023-09-28 12:27:09,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:27:09,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 12:27:09,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:27:09,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:11,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:27:13,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:27:14,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:21,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:27:23,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:26,878 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 12:27:30,286 INFO [train.py:1039] (1/4) Epoch 1, batch 2600, loss[loss=0.3766, simple_loss=0.4113, pruned_loss=0.1709, over 24654.00 frames. ], tot_loss[loss=0.3862, simple_loss=0.3979, pruned_loss=0.1873, over 4732188.14 frames. ], batch size: 73, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:27:31,829 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 12:27:31,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:27:31,923 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 12:27:33,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 12:27:33,460 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 12:27:36,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:36,594 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 12:27:38,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 12:27:38,614 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=10.933333333333334 2023-09-28 12:27:39,557 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 12:27:41,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:27:44,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 12:27:45,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 12:27:45,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=17400.0, ans=0.125 2023-09-28 12:27:47,477 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.295e+02 3.359e+02 4.665e+02 7.266e+02 2.532e+03, threshold=9.331e+02, percent-clipped=13.0 2023-09-28 12:27:47,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:27:47,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 12:27:51,311 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 12:27:51,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 12:28:01,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:01,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:01,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:01,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 12:28:03,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:28:09,670 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 12:28:14,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:15,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:15,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 12:28:17,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:17,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:17,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 12:28:21,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:28:21,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:28:25,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:29,249 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 12:28:29,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:29,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:28:36,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:36,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:28:36,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 12:28:38,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:39,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:28:41,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:28:44,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=17600.0, ans=0.0 2023-09-28 12:28:45,054 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=18.63 vs. limit=20.7 2023-09-28 12:28:46,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 12:28:47,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:47,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:28:52,355 INFO [train.py:1039] (1/4) Epoch 1, batch 2650, loss[loss=0.5041, simple_loss=0.4596, pruned_loss=0.2743, over 19300.00 frames. ], tot_loss[loss=0.3869, simple_loss=0.3982, pruned_loss=0.1878, over 4716133.78 frames. ], batch size: 388, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:28:53,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 12:28:53,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:54,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:28:54,137 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 12:28:54,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:28:57,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:01,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:29:02,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:29:05,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:29:06,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 12:29:06,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:29:06,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:29:09,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 12:29:10,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=17733.333333333332, ans=0.125 2023-09-28 12:29:10,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=17733.333333333332, ans=0.0 2023-09-28 12:29:11,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 12:29:14,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:16,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 12:29:16,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:16,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 12:29:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:20,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:29:22,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:22,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:25,947 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.45 vs. limit=14.175 2023-09-28 12:29:29,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 12:29:29,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 12:29:34,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:29:37,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 12:29:37,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:39,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:39,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:29:41,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:41,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:41,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=17866.666666666668, ans=0.0 2023-09-28 12:29:43,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:46,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:47,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:47,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:29:48,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=17866.666666666668, ans=0.0713333333333333 2023-09-28 12:29:50,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:29:50,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=17866.666666666668, ans=0.125 2023-09-28 12:29:52,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:53,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:29:53,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:55,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:55,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:29:59,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:59,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:29:59,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:01,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 12:30:04,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:04,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:05,680 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.83 vs. limit=14.225 2023-09-28 12:30:06,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:08,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:10,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:30:10,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:13,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:13,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 12:30:15,472 INFO [train.py:1039] (1/4) Epoch 1, batch 2700, loss[loss=0.3434, simple_loss=0.3769, pruned_loss=0.155, over 24581.00 frames. ], tot_loss[loss=0.3833, simple_loss=0.3964, pruned_loss=0.1851, over 4726467.89 frames. ], batch size: 60, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:30:17,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:30:19,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:30:22,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:30:22,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:22,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:22,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=18000.0, ans=0.0 2023-09-28 12:30:23,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:30:23,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:23,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:30:25,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:30:25,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 12:30:25,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:30:27,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:30:28,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:30:30,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:32,944 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.245e+02 3.486e+02 4.470e+02 6.707e+02 1.380e+03, threshold=8.939e+02, percent-clipped=9.0 2023-09-28 12:30:33,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:30:35,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=14.275 2023-09-28 12:30:36,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 12:30:36,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:30:38,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=18066.666666666668, ans=0.125 2023-09-28 12:30:41,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:30:41,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:30:45,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=18066.666666666668, ans=0.125 2023-09-28 12:30:48,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:30:48,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:49,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:30:49,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:30:49,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=18133.333333333332, ans=0.0 2023-09-28 12:30:52,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:30:55,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:55,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:30:55,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:00,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:00,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:31:10,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:31:10,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:31:11,834 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.04 vs. limit=14.325 2023-09-28 12:31:15,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=18200.0, ans=0.125 2023-09-28 12:31:16,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:31:16,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:22,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:22,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:24,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:31:24,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:26,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:26,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:31:27,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:31:31,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:31,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:34,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 12:31:35,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:37,384 INFO [train.py:1039] (1/4) Epoch 1, batch 2750, loss[loss=0.3193, simple_loss=0.359, pruned_loss=0.1398, over 24432.00 frames. ], tot_loss[loss=0.3823, simple_loss=0.3956, pruned_loss=0.1846, over 4717720.87 frames. ], batch size: 58, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:31:37,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:31:37,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 12:31:39,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 12:31:39,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:39,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=18333.333333333332, ans=0.1166666666666667 2023-09-28 12:31:43,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:31:43,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:44,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=18333.333333333332, ans=0.125 2023-09-28 12:31:45,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:45,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:31:45,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:50,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:31:50,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:31:50,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:31:50,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:50,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 12:31:50,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:52,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:59,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 12:32:02,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:32:02,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:02,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=18400.0, ans=0.125 2023-09-28 12:32:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:03,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:32:05,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:32:05,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=18400.0, ans=0.0 2023-09-28 12:32:07,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:32:07,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:07,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:12,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:32:12,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:32:12,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:32:13,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:15,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:32:21,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:24,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:32:25,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:30,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:30,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:32:30,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:32:37,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:32:37,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:37,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 12:32:43,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:45,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 12:32:50,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:32:53,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:32:53,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 12:32:54,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:32:56,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:32:57,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 12:32:57,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:32:58,011 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:32:59,451 INFO [train.py:1039] (1/4) Epoch 1, batch 2800, loss[loss=0.3686, simple_loss=0.3754, pruned_loss=0.1809, over 23760.00 frames. ], tot_loss[loss=0.3791, simple_loss=0.3931, pruned_loss=0.1826, over 4704885.26 frames. ], batch size: 179, lr: 4.36e-02, grad_scale: 32.0 2023-09-28 12:33:01,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:33:01,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:02,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:04,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 12:33:04,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:04,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:05,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:05,739 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 12:33:05,740 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 12:33:10,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:11,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:33:11,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:33:17,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:33:18,625 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 3.169e+02 4.499e+02 7.440e+02 2.031e+03, threshold=8.997e+02, percent-clipped=14.0 2023-09-28 12:33:18,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 12:33:20,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:33:23,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 12:33:24,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:24,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:33:24,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:29,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:31,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:33:31,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:33:39,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.23 vs. limit=14.4 2023-09-28 12:33:39,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:33:41,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:44,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:44,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:33:46,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:46,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=18800.0, ans=0.125 2023-09-28 12:33:49,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=18866.666666666668, ans=0.125 2023-09-28 12:33:50,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:33:50,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 12:33:52,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:52,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:52,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:33:58,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:58,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:59,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=18866.666666666668, ans=0.0 2023-09-28 12:34:03,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:34:04,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:34:04,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:04,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:34:04,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:34:07,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:34:09,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:34:09,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 12:34:09,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:11,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:34:11,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:12,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 12:34:13,494 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.35 vs. limit=21.7 2023-09-28 12:34:14,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:14,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:34:14,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:34:16,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 12:34:22,224 INFO [train.py:1039] (1/4) Epoch 1, batch 2850, loss[loss=0.3317, simple_loss=0.3702, pruned_loss=0.1466, over 24484.00 frames. ], tot_loss[loss=0.3765, simple_loss=0.3917, pruned_loss=0.1807, over 4720905.44 frames. ], batch size: 63, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:34:22,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:34:22,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:34:22,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:34:24,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:29,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:34:29,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:34:29,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:34:32,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:33,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:34,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:34:35,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 12:34:41,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 12:34:41,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:43,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 12:34:43,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:46,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 12:34:46,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=19066.666666666668, ans=0.0 2023-09-28 12:34:47,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 12:34:48,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=19066.666666666668, ans=0.10933333333333331 2023-09-28 12:34:49,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:00,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:03,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:03,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:35:05,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:35:05,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:35:06,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:35:08,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:35:09,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 12:35:11,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=19200.0, ans=0.035 2023-09-28 12:35:12,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:35:12,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=19200.0, ans=0.125 2023-09-28 12:35:13,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.16 vs. limit=14.7 2023-09-28 12:35:14,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:15,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:16,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:18,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:18,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:20,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:22,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:22,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.06 vs. limit=14.7 2023-09-28 12:35:25,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:35:25,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:25,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:25,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=19200.0, ans=0.125 2023-09-28 12:35:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:35:33,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:35:35,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 12:35:35,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 12:35:35,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=19266.666666666668, ans=0.125 2023-09-28 12:35:36,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:35:36,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:38,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 12:35:38,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:35:39,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:39,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:41,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:35:41,383 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 12:35:41,458 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 12:35:41,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:35:41,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:44,382 INFO [train.py:1039] (1/4) Epoch 1, batch 2900, loss[loss=0.3242, simple_loss=0.3514, pruned_loss=0.1485, over 21096.00 frames. ], tot_loss[loss=0.3738, simple_loss=0.3893, pruned_loss=0.1792, over 4704146.29 frames. ], batch size: 46, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:35:46,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:35:48,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:48,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:50,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 12:35:53,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:55,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 12:35:55,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 12:35:57,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:35:57,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:36:00,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:00,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:36:03,248 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 3.297e+02 4.561e+02 6.852e+02 1.887e+03, threshold=9.123e+02, percent-clipped=12.0 2023-09-28 12:36:04,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:36:06,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:36:06,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:36:08,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 12:36:08,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:36:10,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:14,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 12:36:16,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 12:36:19,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:36:19,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 12:36:19,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:36:22,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:36:22,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:36:23,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=19466.666666666668, ans=14.8 2023-09-28 12:36:24,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=19466.666666666668, ans=0.09899494936611666 2023-09-28 12:36:26,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:26,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:28,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=19466.666666666668, ans=0.0 2023-09-28 12:36:31,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:36:33,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:36:33,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 12:36:35,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 12:36:35,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:36:36,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=19533.333333333332, ans=0.125 2023-09-28 12:36:39,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:36:43,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 12:36:44,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:36:47,110 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=14.79 vs. limit=14.825 2023-09-28 12:36:48,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=19533.333333333332, ans=0.125 2023-09-28 12:36:48,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=19533.333333333332, ans=0.125 2023-09-28 12:36:49,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:58,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.45 vs. limit=22.2 2023-09-28 12:36:59,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:36:59,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:37:03,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 12:37:04,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:04,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 12:37:04,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:06,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:37:08,474 INFO [train.py:1039] (1/4) Epoch 1, batch 2950, loss[loss=0.3752, simple_loss=0.3988, pruned_loss=0.1758, over 24640.00 frames. ], tot_loss[loss=0.3732, simple_loss=0.3892, pruned_loss=0.1786, over 4690876.40 frames. ], batch size: 65, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:37:13,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:14,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 12:37:16,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:16,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:16,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=19666.666666666668, ans=0.1 2023-09-28 12:37:18,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:37:19,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:37:20,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.55 vs. limit=14.875 2023-09-28 12:37:21,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 12:37:21,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=19666.666666666668, ans=0.0 2023-09-28 12:37:22,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 12:37:23,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=19733.333333333332, ans=0.125 2023-09-28 12:37:24,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:37:24,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:29,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:32,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.84 vs. limit=9.933333333333334 2023-09-28 12:37:34,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:37:34,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:38,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:37:38,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:37:39,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:37:44,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 12:37:49,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 12:37:49,273 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 12:37:50,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:37:52,249 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 12:37:53,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 12:37:53,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:54,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=19800.0, ans=0.0 2023-09-28 12:37:55,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:55,692 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 12:37:55,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:37:58,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 12:37:58,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:58,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:38:03,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:04,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:38:05,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:05,574 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 12:38:05,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:05,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 12:38:12,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:13,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:15,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 12:38:15,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:38:17,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 12:38:19,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:22,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:38:22,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:38:22,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=19933.333333333332, ans=0.125 2023-09-28 12:38:22,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.55 vs. limit=14.975 2023-09-28 12:38:23,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:23,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:38:25,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:38:26,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:26,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:38:26,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:38:28,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:30,190 INFO [train.py:1039] (1/4) Epoch 1, batch 3000, loss[loss=0.3169, simple_loss=0.3607, pruned_loss=0.1365, over 24336.00 frames. ], tot_loss[loss=0.3725, simple_loss=0.3896, pruned_loss=0.1777, over 4704887.87 frames. ], batch size: 61, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:38:30,190 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 12:38:43,012 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.8054, 1.9216, 2.2539, 2.2908, 2.1219, 2.1916, 2.1106, 2.4212], device='cuda:1') 2023-09-28 12:38:44,454 INFO [train.py:1071] (1/4) Epoch 1, validation: loss=0.4132, simple_loss=0.3632, pruned_loss=0.2317, over 1125622.00 frames. 2023-09-28 12:38:44,454 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 12:38:44,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:38:44,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:44,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 12:38:48,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:49,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:38:51,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:38:54,484 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 12:38:55,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 12:38:57,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:59,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:38:59,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 12:38:59,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:02,621 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.319e+02 3.597e+02 4.607e+02 6.753e+02 1.897e+03, threshold=9.214e+02, percent-clipped=10.0 2023-09-28 12:39:07,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:39:07,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=20066.666666666668, ans=0.0 2023-09-28 12:39:15,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:39:23,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 12:39:23,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:39:25,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=20133.333333333332, ans=0.125 2023-09-28 12:39:27,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:39:27,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:28,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:39:31,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:31,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 12:39:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 12:39:35,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:39:35,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:39:37,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:39:38,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:39,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:39,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:39:43,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:39:43,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:43,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:39:46,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:46,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 12:39:47,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:39:49,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:39:49,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:39:53,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:55,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:56,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:39:57,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 12:39:57,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:39:57,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 12:39:59,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:39:59,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 12:40:02,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:02,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=20266.666666666668, ans=0.2 2023-09-28 12:40:03,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:40:03,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 12:40:04,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=20266.666666666668, ans=0.125 2023-09-28 12:40:05,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 12:40:05,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:40:07,409 INFO [train.py:1039] (1/4) Epoch 1, batch 3050, loss[loss=0.4341, simple_loss=0.4223, pruned_loss=0.223, over 22612.00 frames. ], tot_loss[loss=0.3734, simple_loss=0.391, pruned_loss=0.1779, over 4703865.92 frames. ], batch size: 322, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:40:07,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:40:09,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:40:09,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:40:09,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:10,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:40:12,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 12:40:13,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:13,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=20333.333333333332, ans=0.125 2023-09-28 12:40:15,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:15,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:40:20,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:22,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 12:40:31,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 12:40:31,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 12:40:32,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:37,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:40:39,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:39,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:41,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:43,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=20466.666666666668, ans=0.1 2023-09-28 12:40:44,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:40:45,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:45,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:46,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:46,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:47,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:49,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:50,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:52,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 12:40:52,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:52,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:40:54,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=20466.666666666668, ans=0.125 2023-09-28 12:40:57,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:57,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:40:57,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=20533.333333333332, ans=0.125 2023-09-28 12:40:59,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:40:59,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:05,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:41:05,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:06,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=20533.333333333332, ans=0.1 2023-09-28 12:41:12,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:14,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:41:14,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:41:16,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:17,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:41:17,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:41:19,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 12:41:19,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:19,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:21,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 12:41:24,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:28,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:30,054 INFO [train.py:1039] (1/4) Epoch 1, batch 3100, loss[loss=0.3719, simple_loss=0.3857, pruned_loss=0.179, over 23350.00 frames. ], tot_loss[loss=0.3713, simple_loss=0.39, pruned_loss=0.1763, over 4711317.29 frames. ], batch size: 119, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:41:32,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:41:35,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:41:37,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 12:41:40,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 12:41:40,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 12:41:42,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:41:47,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:41:47,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:48,544 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.389e+02 3.317e+02 4.517e+02 6.154e+02 1.203e+03, threshold=9.035e+02, percent-clipped=5.0 2023-09-28 12:41:50,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:41:54,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:58,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 12:42:00,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.73 vs. limit=15.0 2023-09-28 12:42:03,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:42:04,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:05,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:05,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:05,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:42:07,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:42:07,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 12:42:07,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:42:08,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:10,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 12:42:12,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:42:15,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:42:17,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 12:42:17,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 12:42:18,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:20,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:24,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:24,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:24,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:42:25,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:42:25,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:42:28,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:42:28,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:42:28,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:28,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 12:42:29,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=20866.666666666668, ans=0.125 2023-09-28 12:42:31,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:33,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 12:42:36,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:42:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 12:42:36,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:38,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:38,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 12:42:40,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.68 vs. limit=10.0 2023-09-28 12:42:43,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=20933.333333333332, ans=0.125 2023-09-28 12:42:51,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 12:42:53,242 INFO [train.py:1039] (1/4) Epoch 1, batch 3150, loss[loss=0.3606, simple_loss=0.3713, pruned_loss=0.1749, over 23576.00 frames. ], tot_loss[loss=0.3672, simple_loss=0.3871, pruned_loss=0.1737, over 4711550.56 frames. ], batch size: 135, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:42:53,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=21000.0, ans=0.1 2023-09-28 12:42:54,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:42:56,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:57,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:57,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:42:57,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 12:42:59,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.40 vs. limit=22.5 2023-09-28 12:43:00,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:00,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:43:01,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 12:43:01,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=21000.0, ans=0.125 2023-09-28 12:43:04,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:06,226 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 12:43:09,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 12:43:10,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:43:12,194 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 12:43:13,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:43:14,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 12:43:14,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=21066.666666666668, ans=0.1 2023-09-28 12:43:15,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 12:43:15,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 12:43:15,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:16,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:16,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:17,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 12:43:17,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=21066.666666666668, ans=0.006289855072463768 2023-09-28 12:43:21,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:21,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:23,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:24,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:43:28,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 12:43:28,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:43:31,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:43:31,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 12:43:34,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 12:43:35,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=21133.333333333332, ans=0.125 2023-09-28 12:43:36,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:43:36,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:43:36,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:43:37,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:37,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:43:39,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:43:39,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:43:39,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 12:43:40,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:43:40,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:42,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:43:42,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:42,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 12:43:44,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:45,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 12:43:45,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:47,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 12:43:50,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 12:43:53,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:43:53,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:54,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 12:43:54,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=21200.0, ans=0.006260869565217392 2023-09-28 12:43:56,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:43:56,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:58,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:43:58,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=21266.666666666668, ans=0.125 2023-09-28 12:44:01,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:01,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:44:06,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:44:06,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:09,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:44:12,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=21266.666666666668, ans=0.0 2023-09-28 12:44:16,259 INFO [train.py:1039] (1/4) Epoch 1, batch 3200, loss[loss=0.3489, simple_loss=0.3962, pruned_loss=0.1508, over 24312.00 frames. ], tot_loss[loss=0.3652, simple_loss=0.3857, pruned_loss=0.1724, over 4713087.15 frames. ], batch size: 74, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:44:16,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:44:16,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:44:19,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=21333.333333333332, ans=10.0 2023-09-28 12:44:20,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:23,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:44:23,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 12:44:26,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:44:26,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=21333.333333333332, ans=0.07 2023-09-28 12:44:30,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:44:33,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:35,017 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.369e+02 3.557e+02 4.560e+02 5.822e+02 1.709e+03, threshold=9.121e+02, percent-clipped=8.0 2023-09-28 12:44:42,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:44:47,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=21400.0, ans=0.125 2023-09-28 12:44:50,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 12:44:52,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:44:52,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=21466.666666666668, ans=0.125 2023-09-28 12:44:55,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 12:44:57,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:45:00,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:45:00,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:45:02,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:45:05,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 12:45:08,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:45:10,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 12:45:15,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 12:45:16,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:45:18,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=21533.333333333332, ans=0.125 2023-09-28 12:45:18,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=21533.333333333332, ans=0.125 2023-09-28 12:45:22,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:22,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:45:22,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:24,416 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 12:45:24,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:45:24,658 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:45:29,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:45:32,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 12:45:32,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 12:45:34,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 12:45:34,446 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:45:36,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 12:45:36,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=21600.0, ans=0.0061739130434782605 2023-09-28 12:45:38,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:45:38,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=21666.666666666668, ans=0.125 2023-09-28 12:45:39,877 INFO [train.py:1039] (1/4) Epoch 1, batch 3250, loss[loss=0.376, simple_loss=0.3869, pruned_loss=0.1825, over 23340.00 frames. ], tot_loss[loss=0.364, simple_loss=0.3845, pruned_loss=0.1718, over 4707977.07 frames. ], batch size: 119, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:45:40,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:45:40,165 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 12:45:41,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:45:41,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:45:41,704 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 12:45:43,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=21666.666666666668, ans=0.125 2023-09-28 12:45:46,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:45:49,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:45:59,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:45:59,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 12:45:59,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:45:59,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=21733.333333333332, ans=0.1 2023-09-28 12:46:00,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.63 vs. limit=6.0 2023-09-28 12:46:00,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:00,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:00,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:02,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:46:03,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=21733.333333333332, ans=0.0 2023-09-28 12:46:04,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:04,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:46:06,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:06,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:11,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:13,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:14,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:14,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:16,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:16,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:17,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:23,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 12:46:24,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:46:24,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:46:26,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:46:29,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=21866.666666666668, ans=0.0 2023-09-28 12:46:32,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:46:41,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:46:41,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:41,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 12:46:41,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:46:41,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:46:42,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:44,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 12:46:45,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 12:46:45,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:47,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:49,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:49,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:46:49,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:52,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:52,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:54,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 12:46:54,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:46:57,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:46:57,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 12:47:00,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:47:00,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 12:47:02,061 INFO [train.py:1039] (1/4) Epoch 1, batch 3300, loss[loss=0.3211, simple_loss=0.3609, pruned_loss=0.1406, over 24451.00 frames. ], tot_loss[loss=0.3654, simple_loss=0.3856, pruned_loss=0.1726, over 4710402.91 frames. ], batch size: 58, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:47:02,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 12:47:03,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 12:47:03,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:06,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:47:09,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:47:09,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:12,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:47:12,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:47:14,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:16,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:47:18,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=22066.666666666668, ans=0.006072463768115942 2023-09-28 12:47:20,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.229e+02 3.284e+02 5.093e+02 6.809e+02 1.583e+03, threshold=1.019e+03, percent-clipped=11.0 2023-09-28 12:47:24,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 12:47:25,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:25,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:27,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:27,287 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 12:47:27,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:47:28,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:47:30,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:47:30,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:47:30,417 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 12:47:31,074 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.16 vs. limit=22.5 2023-09-28 12:47:35,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:35,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:47:37,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:37,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 12:47:38,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 12:47:38,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:40,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:47:41,655 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 12:47:43,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 12:47:45,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:47:46,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 12:47:48,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:47:50,642 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.76 vs. limit=15.0 2023-09-28 12:47:51,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:47:51,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:47:54,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:54,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:54,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:54,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:47:55,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=22200.0, ans=0.0 2023-09-28 12:47:57,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:47:58,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:58,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:48:00,285 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 12:48:00,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=22200.0, ans=0.1 2023-09-28 12:48:01,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 12:48:04,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:48:04,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:04,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:48:07,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:48:08,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:08,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:48:10,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:48:11,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:48:15,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 12:48:15,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:16,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:17,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.70 vs. limit=10.0 2023-09-28 12:48:19,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:48:19,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:48:21,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:23,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:23,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:23,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=22333.333333333332, ans=0.0 2023-09-28 12:48:24,671 INFO [train.py:1039] (1/4) Epoch 1, batch 3350, loss[loss=0.3241, simple_loss=0.3559, pruned_loss=0.1462, over 21756.00 frames. ], tot_loss[loss=0.3662, simple_loss=0.3865, pruned_loss=0.1729, over 4707763.65 frames. ], batch size: 47, lr: 4.30e-02, grad_scale: 16.0 2023-09-28 12:48:24,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:48:26,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:28,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:48:30,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:33,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:48:35,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:36,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:48:39,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 12:48:39,318 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 12:48:40,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:43,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 12:48:43,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 12:48:46,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:48:46,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:48:48,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:48,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 12:48:48,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:48,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:48:52,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:55,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:48:57,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=22466.666666666668, ans=0.0 2023-09-28 12:48:59,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:03,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:03,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:08,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:49:08,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:49:11,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:11,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:13,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:16,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 12:49:16,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:49:17,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 12:49:17,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:49:18,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 12:49:20,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:21,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:24,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=22533.333333333332, ans=0.125 2023-09-28 12:49:25,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.98 vs. limit=15.0 2023-09-28 12:49:27,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=22533.333333333332, ans=0.125 2023-09-28 12:49:29,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:31,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 12:49:32,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:49:32,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:49:35,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:49:36,577 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.53 vs. limit=22.5 2023-09-28 12:49:41,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:49:44,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 12:49:44,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:49:44,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:49:44,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=22600.0, ans=0.1 2023-09-28 12:49:48,229 INFO [train.py:1039] (1/4) Epoch 1, batch 3400, loss[loss=0.4769, simple_loss=0.4501, pruned_loss=0.2518, over 19847.00 frames. ], tot_loss[loss=0.3644, simple_loss=0.3857, pruned_loss=0.1715, over 4707557.88 frames. ], batch size: 388, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:49:48,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:48,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 12:49:48,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:50,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 12:49:51,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.60 vs. limit=22.5 2023-09-28 12:49:52,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:52,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:53,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:49:53,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:49:54,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.97 vs. limit=15.0 2023-09-28 12:49:54,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 12:49:59,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 12:49:59,545 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 12:49:59,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:03,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:50:03,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:50:03,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=22733.333333333332, ans=0.125 2023-09-28 12:50:04,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:05,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=22733.333333333332, ans=0.0 2023-09-28 12:50:06,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:50:07,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.27 vs. limit=12.0 2023-09-28 12:50:07,750 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.237e+02 3.122e+02 3.897e+02 5.653e+02 2.230e+03, threshold=7.795e+02, percent-clipped=8.0 2023-09-28 12:50:11,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:12,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 12:50:17,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:50:19,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:19,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:21,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:50:25,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=22800.0, ans=0.125 2023-09-28 12:50:28,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:50:34,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 12:50:40,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 12:50:41,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:50:42,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:42,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:43,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:50:48,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:52,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:50:52,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:50:58,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:00,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 12:51:05,081 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=4.31 vs. limit=15.0 2023-09-28 12:51:05,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:51:09,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 12:51:10,466 INFO [train.py:1039] (1/4) Epoch 1, batch 3450, loss[loss=0.3816, simple_loss=0.3712, pruned_loss=0.196, over 19719.00 frames. ], tot_loss[loss=0.3615, simple_loss=0.3844, pruned_loss=0.1693, over 4715943.31 frames. ], batch size: 388, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:51:12,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 12:51:14,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:16,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:51:16,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 12:51:18,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:22,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:51:24,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=23000.0, ans=0.125 2023-09-28 12:51:26,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:51:28,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:29,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:51:29,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:32,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:33,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-28 12:51:38,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 12:51:39,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.45 vs. limit=6.0 2023-09-28 12:51:44,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 12:51:45,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:51:45,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:51:46,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:50,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 12:51:51,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:51:51,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=23133.333333333332, ans=0.125 2023-09-28 12:51:53,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=23133.333333333332, ans=0.0 2023-09-28 12:51:55,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=23133.333333333332, ans=0.125 2023-09-28 12:51:56,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:51:56,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:56,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=23133.333333333332, ans=0.005840579710144928 2023-09-28 12:51:58,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:51:59,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:52:03,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 12:52:03,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:04,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:52:08,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:10,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 12:52:11,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:52:16,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=23266.666666666668, ans=0.0 2023-09-28 12:52:17,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:52:17,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=23266.666666666668, ans=0.125 2023-09-28 12:52:19,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:23,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:27,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:27,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:52:29,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:52:29,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:32,289 INFO [train.py:1039] (1/4) Epoch 1, batch 3500, loss[loss=0.3273, simple_loss=0.3538, pruned_loss=0.1504, over 23305.00 frames. ], tot_loss[loss=0.3575, simple_loss=0.3815, pruned_loss=0.1668, over 4723943.05 frames. ], batch size: 105, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:52:34,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:36,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=23333.333333333332, ans=0.1 2023-09-28 12:52:38,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:52:39,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 12:52:39,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=23333.333333333332, ans=0.125 2023-09-28 12:52:41,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:52:45,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 12:52:48,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:48,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 12:52:51,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.155e+02 3.379e+02 4.182e+02 5.188e+02 1.059e+03, threshold=8.364e+02, percent-clipped=3.0 2023-09-28 12:52:55,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:52:55,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:57,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:52:57,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:52:57,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:52:59,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:59,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:52:59,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 12:53:00,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:02,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:53:04,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:07,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:08,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 12:53:08,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:53:11,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:14,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:53:15,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:17,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:53:17,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:17,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=23466.666666666668, ans=0.125 2023-09-28 12:53:19,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=23466.666666666668, ans=0.2 2023-09-28 12:53:20,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 12:53:20,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 12:53:20,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 12:53:22,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:23,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:25,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:25,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:53:25,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=23533.333333333332, ans=0.125 2023-09-28 12:53:29,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:53:30,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:53:33,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=23533.333333333332, ans=0.2 2023-09-28 12:53:35,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:53:37,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 12:53:37,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 12:53:37,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:53:40,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:40,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:41,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:46,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 12:53:46,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:48,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:50,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 12:53:51,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 12:53:53,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:55,350 INFO [train.py:1039] (1/4) Epoch 1, batch 3550, loss[loss=0.3798, simple_loss=0.393, pruned_loss=0.1832, over 23714.00 frames. ], tot_loss[loss=0.3559, simple_loss=0.3799, pruned_loss=0.1659, over 4724216.63 frames. ], batch size: 149, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:53:55,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:55,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:53:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:00,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:54:01,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=23666.666666666668, ans=0.125 2023-09-28 12:54:01,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.65 vs. limit=15.0 2023-09-28 12:54:10,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:12,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:54:15,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:16,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:54:18,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:54:18,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:54:21,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=23733.333333333332, ans=0.125 2023-09-28 12:54:23,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:23,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:54:23,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:23,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:54:24,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:54:33,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:54:33,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:34,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:34,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:36,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:54:36,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 12:54:36,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:54:43,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:44,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:46,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:47,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 12:54:49,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:54:50,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 12:54:50,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:53,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:54:53,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:54:58,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 12:55:00,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:02,216 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:55:07,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:07,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 12:55:08,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:12,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=23933.333333333332, ans=0.0 2023-09-28 12:55:13,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:55:14,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 12:55:17,949 INFO [train.py:1039] (1/4) Epoch 1, batch 3600, loss[loss=0.3654, simple_loss=0.3804, pruned_loss=0.1752, over 23839.00 frames. ], tot_loss[loss=0.3538, simple_loss=0.3789, pruned_loss=0.1643, over 4726920.29 frames. ], batch size: 179, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:55:21,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 12:55:21,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:55:22,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:55:25,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:26,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:27,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:55:30,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:32,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:33,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:55:36,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:55:36,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=24066.666666666668, ans=0.0 2023-09-28 12:55:37,406 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.926e+02 4.483e+02 7.377e+02 1.636e+03, threshold=8.966e+02, percent-clipped=15.0 2023-09-28 12:55:37,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:37,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 12:55:41,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:55:42,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:45,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:48,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=24066.666666666668, ans=15.0 2023-09-28 12:55:49,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:50,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:55:51,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 12:55:51,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:52,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.13 vs. limit=15.0 2023-09-28 12:55:54,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:56,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:55:56,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:57,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:59,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:55:59,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 12:56:06,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:07,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:56:09,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 12:56:11,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=24200.0, ans=0.0 2023-09-28 12:56:12,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:56:17,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:21,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:27,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:56:27,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:56:27,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 12:56:29,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 12:56:31,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 12:56:34,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:56:35,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:56:36,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 12:56:36,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:56:36,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:56:36,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:38,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 12:56:39,577 INFO [train.py:1039] (1/4) Epoch 1, batch 3650, loss[loss=0.2967, simple_loss=0.3361, pruned_loss=0.1287, over 21983.00 frames. ], tot_loss[loss=0.3532, simple_loss=0.3786, pruned_loss=0.1639, over 4719112.51 frames. ], batch size: 48, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:56:39,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 12:56:43,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:43,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 12:56:50,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 12:56:53,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:56:57,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 12:56:59,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 12:57:03,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:03,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:57:03,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:57:04,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=24400.0, ans=0.0 2023-09-28 12:57:06,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:57:06,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:57:06,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=24400.0, ans=0.1 2023-09-28 12:57:08,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 12:57:09,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:57:09,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:09,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 12:57:11,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:57:12,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:12,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:14,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:57:18,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 12:57:19,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 12:57:19,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:57:22,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 12:57:24,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:24,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:57:27,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=24466.666666666668, ans=0.1 2023-09-28 12:57:33,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:57:35,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:35,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:57:35,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:57:35,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:57:38,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:57:40,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:41,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:41,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:43,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:57:46,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:46,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:52,453 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 12:57:56,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:56,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:58,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:57:58,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:00,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:58:00,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=24600.0, ans=0.125 2023-09-28 12:58:02,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:02,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=24666.666666666668, ans=0.04949747468305833 2023-09-28 12:58:03,218 INFO [train.py:1039] (1/4) Epoch 1, batch 3700, loss[loss=0.3684, simple_loss=0.3799, pruned_loss=0.1784, over 23651.00 frames. ], tot_loss[loss=0.3528, simple_loss=0.3781, pruned_loss=0.1638, over 4725025.91 frames. ], batch size: 232, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:58:04,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 12:58:04,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:07,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:58:10,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:58:10,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:58:11,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:11,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 12:58:13,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:14,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:58:14,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:58:18,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:58:22,279 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.123e+02 3.422e+02 4.027e+02 5.760e+02 1.496e+03, threshold=8.053e+02, percent-clipped=7.0 2023-09-28 12:58:22,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:58:23,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:24,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=24733.333333333332, ans=0.1 2023-09-28 12:58:25,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:58:25,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:25,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:58:28,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:28,555 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 12:58:39,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:58:40,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:58:41,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:58:41,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 12:58:41,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:44,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:46,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 12:58:47,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:48,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.12 vs. limit=15.0 2023-09-28 12:58:49,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:58:50,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:52,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:58:52,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=24866.666666666668, ans=0.125 2023-09-28 12:58:53,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:58:54,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=24866.666666666668, ans=0.125 2023-09-28 12:58:56,103 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.69 vs. limit=15.0 2023-09-28 12:58:58,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:58,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 12:58:58,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:58,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 12:59:03,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:59:03,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:59:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:08,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 12:59:10,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:59:10,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:59:10,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:11,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:13,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:15,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 12:59:16,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 12:59:16,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:59:18,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:19,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:59:19,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:59:22,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:59:24,346 INFO [train.py:1039] (1/4) Epoch 1, batch 3750, loss[loss=0.3269, simple_loss=0.3647, pruned_loss=0.1446, over 24334.00 frames. ], tot_loss[loss=0.3551, simple_loss=0.38, pruned_loss=0.1651, over 4703879.62 frames. ], batch size: 61, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:59:24,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:59:25,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:59:27,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 12:59:29,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 12:59:32,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:59:32,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 12:59:33,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:59:35,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:37,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:39,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:59:42,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:45,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:59:46,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:59:49,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:51,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:59:53,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 12:59:53,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:59:54,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:59:54,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:59,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 12:59:59,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=25133.333333333332, ans=0.2 2023-09-28 13:00:03,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 13:00:03,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:00:03,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:00:05,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:13,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:00:18,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 13:00:21,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:00:25,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:00:27,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=25200.0, ans=0.125 2023-09-28 13:00:30,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:00:30,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=25266.666666666668, ans=0.125 2023-09-28 13:00:31,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=25266.666666666668, ans=0.015 2023-09-28 13:00:32,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=25266.666666666668, ans=0.125 2023-09-28 13:00:34,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 13:00:34,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:00:36,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:00:38,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:00:39,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:00:46,961 INFO [train.py:1039] (1/4) Epoch 1, batch 3800, loss[loss=0.4364, simple_loss=0.4117, pruned_loss=0.2306, over 19421.00 frames. ], tot_loss[loss=0.357, simple_loss=0.3814, pruned_loss=0.1663, over 4690503.90 frames. ], batch size: 390, lr: 4.25e-02, grad_scale: 32.0 2023-09-28 13:00:49,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:00:51,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=25333.333333333332, ans=0.125 2023-09-28 13:00:52,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:54,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:00:55,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 13:00:55,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:57,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:00:59,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=25333.333333333332, ans=0.0 2023-09-28 13:00:59,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=25333.333333333332, ans=0.2 2023-09-28 13:01:00,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:01:01,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:01:01,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:02,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:01:03,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:01:05,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:01:06,440 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.982e+02 3.306e+02 4.581e+02 6.587e+02 1.016e+03, threshold=9.163e+02, percent-clipped=14.0 2023-09-28 13:01:06,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:06,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 13:01:11,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:01:12,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:01:14,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:17,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:01:18,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:01:19,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:01:19,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:23,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:25,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:28,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=25466.666666666668, ans=0.005333333333333333 2023-09-28 13:01:29,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:01:31,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 13:01:33,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:39,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:01:42,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=25533.333333333332, ans=0.0 2023-09-28 13:01:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:01:44,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=25533.333333333332, ans=0.125 2023-09-28 13:01:45,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 13:01:49,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 13:01:50,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:52,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:54,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:54,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 13:02:00,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 13:02:00,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 13:02:00,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:00,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=25600.0, ans=0.0 2023-09-28 13:02:01,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:02:06,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:02:07,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:02:09,328 INFO [train.py:1039] (1/4) Epoch 1, batch 3850, loss[loss=0.3414, simple_loss=0.3349, pruned_loss=0.1739, over 19662.00 frames. ], tot_loss[loss=0.3538, simple_loss=0.3775, pruned_loss=0.165, over 4652401.06 frames. ], batch size: 388, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:02:11,885 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.33 vs. limit=15.0 2023-09-28 13:02:12,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:02:14,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 13:02:14,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:02:14,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:17,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=25666.666666666668, ans=0.0 2023-09-28 13:02:18,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:02:20,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:23,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:02:23,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 13:02:33,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:33,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:36,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:36,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:02:39,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:02:43,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:43,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:02:44,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:46,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:47,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:47,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:02:49,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 13:02:49,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 13:02:49,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:51,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:54,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:02:55,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:55,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 13:02:59,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 13:02:59,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:01,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 13:03:04,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:03:06,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=25866.666666666668, ans=0.0 2023-09-28 13:03:09,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.59 vs. limit=22.5 2023-09-28 13:03:11,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:11,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:03:16,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:16,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 13:03:17,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 13:03:18,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.16 vs. limit=15.0 2023-09-28 13:03:20,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:20,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:20,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=25933.333333333332, ans=0.125 2023-09-28 13:03:23,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:03:23,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:03:25,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:26,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:26,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:03:27,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 13:03:27,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:03:28,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=25933.333333333332, ans=0.0 2023-09-28 13:03:29,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 13:03:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:29,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:30,007 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.08 vs. limit=15.0 2023-09-28 13:03:30,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:03:32,687 INFO [train.py:1039] (1/4) Epoch 1, batch 3900, loss[loss=0.3238, simple_loss=0.3696, pruned_loss=0.139, over 24661.00 frames. ], tot_loss[loss=0.3511, simple_loss=0.3762, pruned_loss=0.163, over 4660072.20 frames. ], batch size: 65, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:03:32,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:32,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:03:32,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:32,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:34,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:34,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 13:03:34,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:37,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.32 vs. limit=15.0 2023-09-28 13:03:38,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:38,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:38,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:03:41,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:44,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:44,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:46,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:03:47,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 13:03:47,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:03:49,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 13:03:50,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:51,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 13:03:52,277 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.956e+02 3.706e+02 4.742e+02 9.282e+02, threshold=7.412e+02, percent-clipped=1.0 2023-09-28 13:03:52,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 13:03:57,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:03:57,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:57,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:03:57,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:02,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:04:06,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=26133.333333333332, ans=0.04949747468305833 2023-09-28 13:04:06,727 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=15.0 2023-09-28 13:04:07,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:04:09,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:04:09,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:04:16,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=26133.333333333332, ans=0.0 2023-09-28 13:04:16,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=26133.333333333332, ans=0.0 2023-09-28 13:04:19,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:04:21,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:04:21,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=26200.0, ans=0.2 2023-09-28 13:04:28,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:04:29,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:04:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:04:42,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:42,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 13:04:43,126 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.48 vs. limit=6.0 2023-09-28 13:04:44,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 13:04:44,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:45,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 13:04:46,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=26266.666666666668, ans=0.005159420289855072 2023-09-28 13:04:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:49,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 13:04:55,927 INFO [train.py:1039] (1/4) Epoch 1, batch 3950, loss[loss=0.3003, simple_loss=0.3406, pruned_loss=0.13, over 18377.00 frames. ], tot_loss[loss=0.3501, simple_loss=0.3752, pruned_loss=0.1625, over 4666482.06 frames. ], batch size: 40, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:04:57,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:04:57,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=26333.333333333332, ans=0.025 2023-09-28 13:05:00,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 13:05:00,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:05:02,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:05:03,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:05:09,879 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 13:05:09,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:10,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 13:05:12,087 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 13:05:12,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:05:16,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:16,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:05:16,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:18,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 13:05:21,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:05:23,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:23,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:05:23,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:05:25,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:05:25,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=26400.0, ans=10.0 2023-09-28 13:05:28,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=26466.666666666668, ans=0.0 2023-09-28 13:05:37,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:05:37,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:05:42,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 13:05:45,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.48 vs. limit=22.5 2023-09-28 13:05:47,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 13:05:47,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 13:05:47,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:05:49,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:05:58,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:06:00,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:06:00,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:00,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:06:01,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 13:06:01,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=26600.0, ans=0.1 2023-09-28 13:06:06,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:06:06,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:06:11,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 13:06:20,838 INFO [train.py:1039] (1/4) Epoch 1, batch 4000, loss[loss=0.3688, simple_loss=0.4111, pruned_loss=0.1632, over 24315.00 frames. ], tot_loss[loss=0.35, simple_loss=0.3764, pruned_loss=0.1618, over 4674047.24 frames. ], batch size: 74, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:06:22,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=26666.666666666668, ans=0.125 2023-09-28 13:06:26,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:27,378 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.82 vs. limit=6.0 2023-09-28 13:06:31,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:38,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:38,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:06:40,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:40,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 13:06:40,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:06:41,652 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.100e+02 3.230e+02 4.293e+02 5.644e+02 1.126e+03, threshold=8.585e+02, percent-clipped=11.0 2023-09-28 13:06:41,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 13:06:41,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:06:41,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 13:06:43,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:45,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.70 vs. limit=6.0 2023-09-28 13:06:47,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:06:47,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:06:47,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:06:47,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:47,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:06:48,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=26733.333333333332, ans=0.2 2023-09-28 13:06:49,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:06:50,926 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 13:06:52,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:06:52,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:06:55,624 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 13:06:57,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:06:57,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:05,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 13:07:05,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:07:06,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:07:09,153 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 13:07:09,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:07:10,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 13:07:10,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:07:10,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:12,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:07:14,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:07:15,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:07:15,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:17,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 13:07:17,986 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.79 vs. limit=15.0 2023-09-28 13:07:18,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:20,052 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 13:07:22,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=26866.666666666668, ans=0.125 2023-09-28 13:07:24,121 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.57 vs. limit=22.5 2023-09-28 13:07:24,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:07:28,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 13:07:30,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:07:30,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:30,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=26933.333333333332, ans=0.025 2023-09-28 13:07:31,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:33,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:07:33,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:07:38,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:38,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:41,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:07:41,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 13:07:43,730 INFO [train.py:1039] (1/4) Epoch 1, batch 4050, loss[loss=0.3979, simple_loss=0.399, pruned_loss=0.1984, over 23608.00 frames. ], tot_loss[loss=0.3512, simple_loss=0.3775, pruned_loss=0.1624, over 4673154.67 frames. ], batch size: 256, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:07:45,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:07:45,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:07:47,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:07:48,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:07:50,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:51,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=27000.0, ans=0.0 2023-09-28 13:07:51,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=27000.0, ans=0.09899494936611666 2023-09-28 13:07:53,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:54,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:07:55,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=27000.0, ans=0.125 2023-09-28 13:07:56,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:07:57,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:07:59,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:08:03,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:04,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:08:06,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=27066.666666666668, ans=0.0 2023-09-28 13:08:07,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 13:08:09,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 13:08:09,202 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 13:08:12,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:08:13,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=27133.333333333332, ans=0.2 2023-09-28 13:08:19,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=27133.333333333332, ans=0.025 2023-09-28 13:08:21,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 13:08:22,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:22,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=27133.333333333332, ans=0.125 2023-09-28 13:08:26,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:29,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:29,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:08:29,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:32,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:08:35,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 13:08:35,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:08:37,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:39,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 13:08:39,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=27200.0, ans=0.125 2023-09-28 13:08:43,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:43,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=27200.0, ans=0.125 2023-09-28 13:08:52,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 13:08:52,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:52,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:08:56,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 13:08:56,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 13:08:56,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:08:57,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:08:59,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:08:59,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:09:05,397 INFO [train.py:1039] (1/4) Epoch 1, batch 4100, loss[loss=0.3282, simple_loss=0.3642, pruned_loss=0.1461, over 24285.00 frames. ], tot_loss[loss=0.3506, simple_loss=0.3775, pruned_loss=0.1619, over 4657808.40 frames. ], batch size: 56, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:09:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 13:09:10,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 13:09:14,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 13:09:14,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 13:09:14,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:16,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:16,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:18,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:09:18,245 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 13:09:21,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=27400.0, ans=0.1 2023-09-28 13:09:22,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:24,291 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.870e+02 3.486e+02 4.089e+02 6.314e+02, threshold=6.972e+02, percent-clipped=0.0 2023-09-28 13:09:24,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:09:24,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:25,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:09:32,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:09:32,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=27400.0, ans=0.0 2023-09-28 13:09:33,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:33,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:09:33,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 13:09:35,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:35,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:09:35,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:35,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:09:35,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=27400.0, ans=0.0 2023-09-28 13:09:36,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 13:09:38,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:09:38,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 13:09:40,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:09:43,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:43,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 13:09:45,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=26.59 vs. limit=22.5 2023-09-28 13:09:46,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:09:46,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:09:47,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:09:49,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 13:09:49,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:09:51,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:09:55,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 13:09:56,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:56,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=27533.333333333332, ans=0.125 2023-09-28 13:09:57,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:10:01,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:06,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=27533.333333333332, ans=0.125 2023-09-28 13:10:08,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:11,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:12,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:10:16,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=27600.0, ans=0.125 2023-09-28 13:10:17,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:17,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:20,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:24,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:10:27,502 INFO [train.py:1039] (1/4) Epoch 1, batch 4150, loss[loss=0.3401, simple_loss=0.3844, pruned_loss=0.1478, over 24697.00 frames. ], tot_loss[loss=0.3489, simple_loss=0.3766, pruned_loss=0.1606, over 4667818.60 frames. ], batch size: 73, lr: 4.21e-02, grad_scale: 32.0 2023-09-28 13:10:29,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:10:30,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:10:32,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:10:32,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:35,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 13:10:35,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:35,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 13:10:37,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 13:10:37,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 13:10:38,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:44,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:10:44,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:44,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=27733.333333333332, ans=0.1 2023-09-28 13:10:47,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:10:47,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:10:49,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:10:50,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:10:51,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:53,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:10:53,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=27733.333333333332, ans=0.125 2023-09-28 13:10:58,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:03,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:03,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 13:11:06,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 13:11:06,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:11:07,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 13:11:07,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:11:07,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:08,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=27800.0, ans=0.2 2023-09-28 13:11:11,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:13,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:16,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 13:11:20,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:20,209 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:11:22,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:11:22,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 13:11:24,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:25,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 13:11:27,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:11:27,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=27866.666666666668, ans=0.125 2023-09-28 13:11:28,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:30,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:30,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 13:11:30,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:11:30,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:11:30,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=27933.333333333332, ans=0.0 2023-09-28 13:11:32,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:11:36,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 13:11:36,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:37,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:11:37,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:11:38,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 13:11:40,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:40,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:11:40,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:43,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:43,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 13:11:43,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:45,834 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:11:49,438 INFO [train.py:1039] (1/4) Epoch 1, batch 4200, loss[loss=0.364, simple_loss=0.3449, pruned_loss=0.1916, over 19417.00 frames. ], tot_loss[loss=0.3475, simple_loss=0.3752, pruned_loss=0.1599, over 4675394.62 frames. ], batch size: 388, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:11:49,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:11:51,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 13:11:52,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:11:54,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:11:54,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:11:56,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:56,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:57,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=28000.0, ans=0.125 2023-09-28 13:11:58,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 13:12:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 13:12:02,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:05,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:07,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=28066.666666666668, ans=0.125 2023-09-28 13:12:08,513 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.106e+02 3.443e+02 4.074e+02 5.504e+02 1.074e+03, threshold=8.148e+02, percent-clipped=10.0 2023-09-28 13:12:08,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:12:10,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=28066.666666666668, ans=0.0 2023-09-28 13:12:13,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:12:15,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:15,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:15,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.60 vs. limit=10.0 2023-09-28 13:12:16,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 13:12:16,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:18,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:18,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:12:18,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:12:22,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:12:23,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 13:12:23,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:28,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:12:28,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:12:33,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:12:33,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:12:35,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:12:36,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 13:12:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:12:36,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=28200.0, ans=0.004739130434782609 2023-09-28 13:12:38,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:12:39,108 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.23 vs. limit=12.0 2023-09-28 13:12:43,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=28200.0, ans=0.004739130434782609 2023-09-28 13:12:44,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:12:46,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:48,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=28200.0, ans=0.125 2023-09-28 13:12:50,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.93 vs. limit=15.0 2023-09-28 13:12:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:12:55,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 13:12:58,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:03,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:13:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:06,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 13:13:11,061 INFO [train.py:1039] (1/4) Epoch 1, batch 4250, loss[loss=0.3011, simple_loss=0.3401, pruned_loss=0.1311, over 24299.00 frames. ], tot_loss[loss=0.3454, simple_loss=0.3732, pruned_loss=0.1587, over 4689642.63 frames. ], batch size: 56, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:13:11,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:13:15,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:13:17,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:13:20,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:24,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=28333.333333333332, ans=0.004710144927536232 2023-09-28 13:13:25,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:13:25,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 13:13:27,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:13:29,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:33,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:37,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:38,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:40,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:13:41,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:13:41,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:43,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:43,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:46,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:13:48,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:50,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 13:13:52,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 13:13:52,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:52,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:53,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:53,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:13:53,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:55,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:58,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:14:00,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:14:03,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=28533.333333333332, ans=0.2 2023-09-28 13:14:04,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:05,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:07,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 13:14:07,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:14:07,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 13:14:10,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:14:12,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:14:13,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:13,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:14:16,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 13:14:18,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:14:19,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:14:23,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:26,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:28,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:14:28,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:31,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:31,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=28666.666666666668, ans=0.00463768115942029 2023-09-28 13:14:32,638 INFO [train.py:1039] (1/4) Epoch 1, batch 4300, loss[loss=0.3632, simple_loss=0.3696, pruned_loss=0.1784, over 22723.00 frames. ], tot_loss[loss=0.3446, simple_loss=0.3728, pruned_loss=0.1582, over 4682943.95 frames. ], batch size: 322, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:14:32,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:14:32,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:14:34,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 13:14:36,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:42,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:43,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:14:46,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:52,608 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.815e+02 2.945e+02 3.341e+02 4.373e+02 7.931e+02, threshold=6.681e+02, percent-clipped=0.0 2023-09-28 13:14:53,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:53,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 13:14:55,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:14:59,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:14:59,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:14:59,624 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 13:15:02,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:15:04,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:06,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=28800.0, ans=0.02 2023-09-28 13:15:07,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 13:15:08,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:15:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 13:15:11,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:15:11,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:15:11,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=28800.0, ans=0.1 2023-09-28 13:15:15,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:15:15,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:15:16,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:15:18,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:19,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:15:19,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 13:15:19,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 13:15:22,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:15:25,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:25,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:15:27,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:27,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:27,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 13:15:27,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 13:15:29,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 13:15:29,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:15:29,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 13:15:29,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 13:15:34,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:35,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.30 vs. limit=15.0 2023-09-28 13:15:36,101 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 13:15:37,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:15:39,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:39,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:40,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 13:15:42,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:42,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:44,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:15:44,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:44,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:15:47,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:15:50,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-09-28 13:15:51,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:52,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:52,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:55,544 INFO [train.py:1039] (1/4) Epoch 1, batch 4350, loss[loss=0.3338, simple_loss=0.3729, pruned_loss=0.1474, over 24685.00 frames. ], tot_loss[loss=0.3457, simple_loss=0.3737, pruned_loss=0.1589, over 4687612.17 frames. ], batch size: 65, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:15:57,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 13:15:58,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:16:02,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:05,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:07,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:16:07,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:16:12,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:16:13,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.73 vs. limit=15.0 2023-09-28 13:16:17,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:19,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:16:19,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:16:24,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:16:26,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=29066.666666666668, ans=0.05 2023-09-28 13:16:27,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:16:28,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:16:33,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 13:16:33,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:37,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:37,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=29133.333333333332, ans=0.0 2023-09-28 13:16:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:44,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 13:16:48,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:16:50,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:16:55,271 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 13:16:55,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=29200.0, ans=0.125 2023-09-28 13:16:56,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:16:58,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:16:58,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=29200.0, ans=0.125 2023-09-28 13:16:59,788 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 13:16:59,880 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 13:16:59,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:01,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:02,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:17:02,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:02,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=29266.666666666668, ans=0.125 2023-09-28 13:17:03,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:03,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:05,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 13:17:05,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:05,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:05,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:06,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 13:17:07,170 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 13:17:07,178 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 13:17:07,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 13:17:10,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:17:10,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:17:12,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:12,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:17:15,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 13:17:18,354 INFO [train.py:1039] (1/4) Epoch 1, batch 4400, loss[loss=0.377, simple_loss=0.3886, pruned_loss=0.1828, over 22727.00 frames. ], tot_loss[loss=0.3455, simple_loss=0.3742, pruned_loss=0.1584, over 4702849.42 frames. ], batch size: 322, lr: 4.18e-02, grad_scale: 32.0 2023-09-28 13:17:18,448 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 13:17:18,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:25,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:25,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:26,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:28,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 13:17:30,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 13:17:30,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 13:17:30,361 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 13:17:31,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:17:31,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:32,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=29333.333333333332, ans=0.125 2023-09-28 13:17:34,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.10 vs. limit=6.0 2023-09-28 13:17:34,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 13:17:36,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:37,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:38,347 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.230e+02 3.318e+02 4.079e+02 5.261e+02 1.011e+03, threshold=8.157e+02, percent-clipped=12.0 2023-09-28 13:17:38,470 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 13:17:41,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:41,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 13:17:41,701 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 13:17:41,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=29400.0, ans=0.1 2023-09-28 13:17:44,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 13:17:44,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 13:17:45,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 13:17:45,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:47,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:48,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:50,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:51,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 13:17:51,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 13:17:53,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:54,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:17:54,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:56,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:56,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:56,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 13:17:58,698 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 13:18:03,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:09,153 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.49 vs. limit=15.0 2023-09-28 13:18:09,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:18:12,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 13:18:15,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:18:18,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:20,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:18:22,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 13:18:22,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:18:22,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:18:22,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:18:23,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:18:28,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 13:18:29,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 13:18:31,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 13:18:31,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:18:31,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 13:18:31,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:18:35,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:18:38,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 13:18:41,496 INFO [train.py:1039] (1/4) Epoch 1, batch 4450, loss[loss=0.379, simple_loss=0.3876, pruned_loss=0.1852, over 23734.00 frames. ], tot_loss[loss=0.3477, simple_loss=0.3757, pruned_loss=0.1598, over 4704452.26 frames. ], batch size: 232, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:18:41,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:45,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:45,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:18:45,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.27 vs. limit=15.0 2023-09-28 13:18:52,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:18:52,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:18:56,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:58,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:18:58,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=29733.333333333332, ans=0.125 2023-09-28 13:19:00,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:19:00,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=29733.333333333332, ans=0.125 2023-09-28 13:19:01,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:03,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 13:19:03,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:05,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:05,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:05,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:19:08,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:19:14,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:14,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:16,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:17,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:19,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:19:23,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:19:23,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=29800.0, ans=0.2 2023-09-28 13:19:24,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 13:19:25,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 13:19:25,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:19:26,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:28,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 13:19:28,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=29800.0, ans=0.2 2023-09-28 13:19:31,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:19:37,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:37,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 13:19:37,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:37,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:37,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:19:37,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:40,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:41,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-09-28 13:19:44,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:19:44,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 13:19:46,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:19:49,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:50,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:52,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:53,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:19:55,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:19:59,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 13:20:01,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:20:03,926 INFO [train.py:1039] (1/4) Epoch 1, batch 4500, loss[loss=0.3117, simple_loss=0.3513, pruned_loss=0.1361, over 24619.00 frames. ], tot_loss[loss=0.3475, simple_loss=0.3756, pruned_loss=0.1597, over 4712328.93 frames. ], batch size: 60, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:20:07,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:08,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 13:20:08,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 13:20:10,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:15,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:20:17,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:17,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:20:18,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:20:18,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:19,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:23,494 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.079e+02 2.950e+02 3.479e+02 4.293e+02 6.506e+02, threshold=6.959e+02, percent-clipped=0.0 2023-09-28 13:20:33,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:33,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:20:37,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:20:39,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:20:39,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:20:45,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:20:49,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:20:51,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=30133.333333333332, ans=0.1 2023-09-28 13:20:53,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:20:57,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:20:57,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 13:20:57,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:20:59,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:20:59,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=30200.0, ans=0.125 2023-09-28 13:21:02,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:02,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:21:05,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:21:05,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 13:21:05,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:21:05,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:10,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:21:10,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:21:14,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:14,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=30266.666666666668, ans=0.004289855072463768 2023-09-28 13:21:15,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:21:15,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:21:17,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 13:21:20,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 13:21:20,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 13:21:20,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=30266.666666666668, ans=0.2 2023-09-28 13:21:22,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 13:21:26,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 13:21:26,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=30333.333333333332, ans=0.125 2023-09-28 13:21:27,404 INFO [train.py:1039] (1/4) Epoch 1, batch 4550, loss[loss=0.3743, simple_loss=0.4035, pruned_loss=0.1725, over 23533.00 frames. ], tot_loss[loss=0.3454, simple_loss=0.374, pruned_loss=0.1584, over 4710899.25 frames. ], batch size: 93, lr: 4.16e-02, grad_scale: 32.0 2023-09-28 13:21:27,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:32,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:32,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:37,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:40,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.13 vs. limit=22.5 2023-09-28 13:21:40,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:21:42,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:43,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.99 vs. limit=15.0 2023-09-28 13:21:44,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:21:44,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:21:44,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:46,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:48,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:50,608 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.26 vs. limit=22.5 2023-09-28 13:21:51,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:21:53,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 13:21:55,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 13:21:55,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:21:55,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=30400.0, ans=0.125 2023-09-28 13:21:56,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 13:21:57,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=30400.0, ans=0.125 2023-09-28 13:22:01,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 13:22:02,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:04,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=30466.666666666668, ans=0.1 2023-09-28 13:22:07,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 13:22:09,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:22:13,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:22:14,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=30466.666666666668, ans=0.1 2023-09-28 13:22:16,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 13:22:16,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=30533.333333333332, ans=0.004231884057971015 2023-09-28 13:22:19,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:20,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:20,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:21,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=30533.333333333332, ans=0.1 2023-09-28 13:22:22,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:24,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 13:22:24,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 13:22:24,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:22:26,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 13:22:28,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 13:22:28,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:28,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:29,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:31,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:31,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:22:31,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:22:32,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 13:22:35,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:35,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:22:35,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 13:22:35,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:22:37,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 13:22:40,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:22:40,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:22:42,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:22:44,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:44,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:22:45,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:22:48,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:22:51,269 INFO [train.py:1039] (1/4) Epoch 1, batch 4600, loss[loss=0.3001, simple_loss=0.2985, pruned_loss=0.1508, over 18920.00 frames. ], tot_loss[loss=0.3428, simple_loss=0.3724, pruned_loss=0.1566, over 4707015.12 frames. ], batch size: 388, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:22:52,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:54,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:56,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:22:57,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=30666.666666666668, ans=0.1 2023-09-28 13:22:58,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:22:58,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:22:59,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 13:23:02,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:23:05,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:23:06,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:08,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:11,498 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.016e+02 3.238e+02 3.793e+02 5.269e+02 1.285e+03, threshold=7.587e+02, percent-clipped=5.0 2023-09-28 13:23:15,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 13:23:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:17,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=30733.333333333332, ans=0.00418840579710145 2023-09-28 13:23:20,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:25,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:23:25,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:31,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 13:23:31,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:23:33,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:23:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:38,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=30800.0, ans=0.125 2023-09-28 13:23:39,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:23:40,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:23:44,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 13:23:44,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:23:49,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:49,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:23:51,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:51,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 13:23:52,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:53,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 13:23:53,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:55,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:56,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:57,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:57,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:57,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 13:23:59,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 13:24:00,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 13:24:00,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:02,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:02,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:04,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:24:14,412 INFO [train.py:1039] (1/4) Epoch 1, batch 4650, loss[loss=0.269, simple_loss=0.3185, pruned_loss=0.1098, over 24442.00 frames. ], tot_loss[loss=0.3396, simple_loss=0.3703, pruned_loss=0.1545, over 4708609.89 frames. ], batch size: 58, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:24:14,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:24:17,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=31000.0, ans=0.09899494936611666 2023-09-28 13:24:18,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:19,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:24:19,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:19,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:21,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:24,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 13:24:25,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=31000.0, ans=0.125 2023-09-28 13:24:27,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:24:28,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 13:24:28,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 13:24:30,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:24:32,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 13:24:32,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 13:24:32,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:33,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:24:34,825 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.08 vs. limit=22.5 2023-09-28 13:24:38,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:24:38,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:39,018 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 13:24:41,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=31066.666666666668, ans=0.0 2023-09-28 13:24:41,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=31066.666666666668, ans=0.07 2023-09-28 13:24:42,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=31066.666666666668, ans=0.1 2023-09-28 13:24:44,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:45,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 13:24:48,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:48,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:24:50,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 13:24:51,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:24:55,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:24:58,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:03,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:06,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:07,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:09,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:25:11,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 13:25:12,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 13:25:14,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 13:25:14,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 13:25:15,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:17,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=31200.0, ans=0.125 2023-09-28 13:25:23,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:25:23,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:23,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 13:25:23,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:24,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:24,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:25:26,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:25:30,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:25:30,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:30,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:30,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=31266.666666666668, ans=0.004072463768115942 2023-09-28 13:25:33,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:34,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:25:34,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:25:34,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 13:25:35,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:25:36,411 INFO [train.py:1039] (1/4) Epoch 1, batch 4700, loss[loss=0.3448, simple_loss=0.3763, pruned_loss=0.1567, over 24647.00 frames. ], tot_loss[loss=0.341, simple_loss=0.3717, pruned_loss=0.1551, over 4719202.38 frames. ], batch size: 65, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:25:36,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 13:25:38,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=31333.333333333332, ans=0.0 2023-09-28 13:25:44,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:45,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:47,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:25:48,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:50,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:25:55,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 13:25:56,846 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.950e+02 3.070e+02 3.636e+02 4.699e+02 2.301e+03, threshold=7.272e+02, percent-clipped=9.0 2023-09-28 13:25:56,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 13:26:00,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:02,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:26:02,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:26:05,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:12,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:26:14,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:26:15,230 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=12.0 2023-09-28 13:26:17,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:26:22,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 13:26:24,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:26:26,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:30,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 13:26:31,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:26:31,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=31533.333333333332, ans=0.025 2023-09-28 13:26:36,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:26:37,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 13:26:39,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:39,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:41,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:41,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:26:41,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 13:26:42,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=31600.0, ans=0.004 2023-09-28 13:26:43,240 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 13:26:43,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:45,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.99 vs. limit=10.0 2023-09-28 13:26:46,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 13:26:46,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=31600.0, ans=0.0 2023-09-28 13:26:49,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:52,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 13:26:55,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:26:56,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:26:59,832 INFO [train.py:1039] (1/4) Epoch 1, batch 4750, loss[loss=0.3455, simple_loss=0.3721, pruned_loss=0.1594, over 23833.00 frames. ], tot_loss[loss=0.3404, simple_loss=0.3714, pruned_loss=0.1547, over 4702660.02 frames. ], batch size: 179, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:27:03,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:03,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:27:03,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=31666.666666666668, ans=0.125 2023-09-28 13:27:04,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 13:27:04,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:08,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 13:27:10,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:27:10,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:27:10,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=31666.666666666668, ans=0.003985507246376811 2023-09-28 13:27:11,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:16,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 13:27:19,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:27:22,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 13:27:23,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:26,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:26,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:26,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:27,090 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 13:27:27,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 13:27:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 13:27:36,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:36,795 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.73 vs. limit=15.0 2023-09-28 13:27:39,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:27:40,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:27:40,797 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 13:27:40,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:27:44,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:27:46,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:27:46,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=31800.0, ans=0.003956521739130435 2023-09-28 13:27:49,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 13:27:49,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 13:27:49,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:49,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:27:50,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:52,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:27:52,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 13:27:53,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 13:27:56,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:27:59,474 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.54 vs. limit=15.0 2023-09-28 13:28:00,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:28:00,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 13:28:00,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:02,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:04,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:28:04,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:05,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:28:10,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:10,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 13:28:11,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 13:28:11,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 13:28:15,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:28:15,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:16,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 13:28:21,635 INFO [train.py:1039] (1/4) Epoch 1, batch 4800, loss[loss=0.3301, simple_loss=0.3607, pruned_loss=0.1497, over 23566.00 frames. ], tot_loss[loss=0.341, simple_loss=0.3721, pruned_loss=0.1549, over 4702441.56 frames. ], batch size: 135, lr: 4.13e-02, grad_scale: 32.0 2023-09-28 13:28:23,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:23,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:24,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.01 vs. limit=12.0 2023-09-28 13:28:28,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=32000.0, ans=0.2 2023-09-28 13:28:29,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:28:30,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:30,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:31,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 13:28:32,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:32,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:28:36,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:28:41,495 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.966e+02 2.727e+02 3.187e+02 3.824e+02 7.207e+02, threshold=6.374e+02, percent-clipped=0.0 2023-09-28 13:28:41,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:28:43,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:43,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:28:44,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:44,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:28:44,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:46,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:49,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:54,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:54,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:55,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:28:57,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:28:59,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:00,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 13:29:01,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 13:29:02,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:02,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:29:03,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:29:03,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:03,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:29:06,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:29:06,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:10,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:16,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:18,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:23,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 13:29:23,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:23,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:29:25,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:27,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:28,057 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.72 vs. limit=22.5 2023-09-28 13:29:29,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:29:29,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:30,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:29:30,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:29:32,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:29:36,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:36,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:36,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:37,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 13:29:40,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 13:29:40,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:29:40,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:43,144 INFO [train.py:1039] (1/4) Epoch 1, batch 4850, loss[loss=0.3368, simple_loss=0.3703, pruned_loss=0.1516, over 24642.00 frames. ], tot_loss[loss=0.3411, simple_loss=0.3722, pruned_loss=0.155, over 4708597.31 frames. ], batch size: 65, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:29:43,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:53,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 13:29:55,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:55,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.23 vs. limit=10.0 2023-09-28 13:29:56,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=32333.333333333332, ans=0.1 2023-09-28 13:30:01,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:02,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:30:02,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:04,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=32400.0, ans=0.125 2023-09-28 13:30:06,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:30:07,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:30:07,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:30:07,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 13:30:12,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:30:15,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:30:15,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:30:15,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:30:15,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 13:30:18,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:18,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 13:30:25,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=32466.666666666668, ans=0.2 2023-09-28 13:30:26,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 13:30:27,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:30:31,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.12 vs. limit=22.5 2023-09-28 13:30:34,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=32533.333333333332, ans=0.025 2023-09-28 13:30:36,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:30:36,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 13:30:37,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:30:37,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:30:39,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:30:39,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 13:30:39,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:43,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 13:30:43,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:43,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=32533.333333333332, ans=0.1 2023-09-28 13:30:44,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:30:46,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 13:30:52,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=32600.0, ans=0.0037826086956521737 2023-09-28 13:30:54,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:31:00,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:31:00,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:05,051 INFO [train.py:1039] (1/4) Epoch 1, batch 4900, loss[loss=0.2948, simple_loss=0.3101, pruned_loss=0.1397, over 22690.00 frames. ], tot_loss[loss=0.3402, simple_loss=0.3713, pruned_loss=0.1546, over 4695522.90 frames. ], batch size: 322, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:31:05,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 13:31:05,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:31:12,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:12,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:12,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:31:17,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 13:31:22,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 13:31:23,892 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.271e+02 3.049e+02 3.510e+02 4.577e+02 9.864e+02, threshold=7.020e+02, percent-clipped=4.0 2023-09-28 13:31:25,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 13:31:27,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 13:31:27,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:29,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:29,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:31:29,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:29,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:31:31,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 13:31:35,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 13:31:37,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:31:39,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:31:39,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:41,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:31:42,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:42,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=32800.0, ans=0.125 2023-09-28 13:31:44,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:31:44,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 13:31:47,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:31:48,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:50,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 13:31:50,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 13:31:53,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 13:31:54,714 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.87 vs. limit=10.0 2023-09-28 13:31:55,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:31:55,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:31:57,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:31:57,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:57,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:31:57,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:31:58,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 13:32:02,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:03,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:32:04,358 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:32:06,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:32:06,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=32866.666666666664, ans=0.125 2023-09-28 13:32:09,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 13:32:09,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:32:10,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 13:32:10,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 13:32:11,467 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.89 vs. limit=15.0 2023-09-28 13:32:14,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=32933.333333333336, ans=0.125 2023-09-28 13:32:18,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:20,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:32:22,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 13:32:22,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=32933.333333333336, ans=0.0 2023-09-28 13:32:23,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:23,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:32:23,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:27,215 INFO [train.py:1039] (1/4) Epoch 1, batch 4950, loss[loss=0.3606, simple_loss=0.3774, pruned_loss=0.172, over 23462.00 frames. ], tot_loss[loss=0.3376, simple_loss=0.3687, pruned_loss=0.1532, over 4698614.90 frames. ], batch size: 134, lr: 4.11e-02, grad_scale: 32.0 2023-09-28 13:32:28,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:28,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:32:28,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:28,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:32:29,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=33000.0, ans=0.125 2023-09-28 13:32:30,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:32:30,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=33000.0, ans=0.0 2023-09-28 13:32:33,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:35,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:37,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 13:32:38,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 13:32:38,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:32:40,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 13:32:40,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:40,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:32:40,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:32:40,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:32:42,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:43,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:32:45,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:32:46,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:48,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:48,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:53,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:32:55,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=33066.666666666664, ans=0.0 2023-09-28 13:32:58,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:00,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:33:02,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:02,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:02,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=33133.333333333336, ans=0.0 2023-09-28 13:33:03,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:33:05,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 13:33:05,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 13:33:07,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.15 vs. limit=10.0 2023-09-28 13:33:09,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:10,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:33:12,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:33:12,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:33:12,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:33:13,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=33133.333333333336, ans=0.1 2023-09-28 13:33:14,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:33:16,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:19,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:33:19,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=33200.0, ans=0.0036521739130434784 2023-09-28 13:33:20,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:33:22,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:22,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.66 vs. limit=15.0 2023-09-28 13:33:23,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:23,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 13:33:25,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:33:25,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:33:30,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:33:32,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:33:32,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:33:32,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:34,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:33:34,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:33:37,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:33:37,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:33:38,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:38,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 13:33:42,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:33:42,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=33266.666666666664, ans=0.0 2023-09-28 13:33:49,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 13:33:50,625 INFO [train.py:1039] (1/4) Epoch 1, batch 5000, loss[loss=0.3483, simple_loss=0.3812, pruned_loss=0.1577, over 23436.00 frames. ], tot_loss[loss=0.337, simple_loss=0.3683, pruned_loss=0.1529, over 4693821.02 frames. ], batch size: 106, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:33:50,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:33:57,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:57,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:33:58,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 13:34:02,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 13:34:04,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:04,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 13:34:05,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:34:05,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:34:07,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 13:34:07,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:07,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:09,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 13:34:09,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:09,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:09,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=33400.0, ans=0.2 2023-09-28 13:34:10,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 13:34:12,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.427e+02 3.057e+02 3.609e+02 4.472e+02 7.216e+02, threshold=7.218e+02, percent-clipped=2.0 2023-09-28 13:34:12,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 13:34:12,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:34:12,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 13:34:12,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:34:13,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:15,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:34:15,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 13:34:15,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 13:34:17,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 13:34:17,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:18,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:19,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 13:34:19,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:34:20,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:20,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=33400.0, ans=0.125 2023-09-28 13:34:22,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:24,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:34:25,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 13:34:26,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:34:26,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=33466.666666666664, ans=0.1 2023-09-28 13:34:27,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:34:32,123 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 13:34:36,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:37,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:37,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:34:41,454 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.72 vs. limit=15.0 2023-09-28 13:34:42,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 13:34:42,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:43,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:43,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:34:45,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 13:34:45,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:49,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:50,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:55,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 13:35:00,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:11,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:35:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:12,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:35:12,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:12,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:35:12,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:35:14,085 INFO [train.py:1039] (1/4) Epoch 1, batch 5050, loss[loss=0.3376, simple_loss=0.3608, pruned_loss=0.1572, over 23664.00 frames. ], tot_loss[loss=0.3367, simple_loss=0.3688, pruned_loss=0.1523, over 4710580.79 frames. ], batch size: 149, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:35:14,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 13:35:19,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:35:22,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:26,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:35:26,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 13:35:26,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.55 vs. limit=15.0 2023-09-28 13:35:27,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:27,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:35:30,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:35:32,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:35:33,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:35:39,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=33733.333333333336, ans=0.125 2023-09-28 13:35:42,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 13:35:44,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:35:46,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:35:46,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 13:35:47,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:35:48,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:49,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:49,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:35:49,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 13:35:51,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 13:35:52,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:54,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:35:57,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:57,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 13:36:00,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:04,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 13:36:05,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:36:05,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:36:07,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:07,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:36:09,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:12,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:36:12,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:12,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:36:14,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:36:14,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 13:36:15,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:36:18,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:36:23,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:23,059 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 13:36:23,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:36:24,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:24,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:26,123 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 13:36:29,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:30,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 13:36:30,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:33,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:34,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:34,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 13:36:36,980 INFO [train.py:1039] (1/4) Epoch 1, batch 5100, loss[loss=0.3576, simple_loss=0.3674, pruned_loss=0.1739, over 23680.00 frames. ], tot_loss[loss=0.3369, simple_loss=0.3694, pruned_loss=0.1522, over 4717565.58 frames. ], batch size: 232, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:36:37,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 13:36:37,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:37,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:36:39,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:36:42,575 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 13:36:46,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:47,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 13:36:49,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 13:36:49,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:49,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:49,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=34000.0, ans=0.003478260869565218 2023-09-28 13:36:52,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:54,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 13:36:54,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 13:36:54,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=34066.666666666664, ans=0.125 2023-09-28 13:36:59,956 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 3.081e+02 3.841e+02 4.720e+02 7.459e+02, threshold=7.682e+02, percent-clipped=1.0 2023-09-28 13:37:00,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:37:01,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:37:06,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:37:09,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 13:37:09,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:13,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:37:13,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:37:13,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=34133.333333333336, ans=0.125 2023-09-28 13:37:16,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=34133.333333333336, ans=0.2 2023-09-28 13:37:17,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 13:37:20,126 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 13:37:21,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:21,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 13:37:21,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 13:37:26,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:27,120 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=15.0 2023-09-28 13:37:32,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:37:36,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 13:37:38,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 13:37:38,272 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 13:37:41,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 13:37:41,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:42,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 13:37:46,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 13:37:49,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:37:51,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:37:53,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 13:37:56,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:37:56,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 13:37:58,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=34266.666666666664, ans=0.125 2023-09-28 13:38:01,387 INFO [train.py:1039] (1/4) Epoch 1, batch 5150, loss[loss=0.3523, simple_loss=0.3778, pruned_loss=0.1634, over 18929.00 frames. ], tot_loss[loss=0.338, simple_loss=0.3704, pruned_loss=0.1528, over 4712870.15 frames. ], batch size: 41, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:38:03,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:38:03,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:38:03,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:38:04,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:38:04,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:38:06,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:38:06,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 13:38:06,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 13:38:07,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 13:38:07,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:38:07,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 13:38:11,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:11,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:38:11,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=34333.333333333336, ans=0.125 2023-09-28 13:38:13,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:14,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:18,948 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=25.60 vs. limit=22.5 2023-09-28 13:38:19,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:38:19,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 13:38:19,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:19,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:38:23,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:38:23,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:23,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:25,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:38:25,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:38:25,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 13:38:25,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=34400.0, ans=0.125 2023-09-28 13:38:25,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=34400.0, ans=0.125 2023-09-28 13:38:26,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:38:27,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:38:28,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.32 vs. limit=22.5 2023-09-28 13:38:28,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:38:32,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 13:38:33,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:38:39,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:38:42,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=34466.666666666664, ans=0.05 2023-09-28 13:38:43,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 13:38:46,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:52,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=15.44 vs. limit=22.5 2023-09-28 13:38:54,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:55,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:55,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=34533.333333333336, ans=0.95 2023-09-28 13:38:56,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=34533.333333333336, ans=0.1 2023-09-28 13:39:00,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:00,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:02,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 13:39:07,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:39:08,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:39:09,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:39:12,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:12,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:13,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 13:39:20,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:21,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:39:24,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:39:24,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:39:24,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:39:25,900 INFO [train.py:1039] (1/4) Epoch 1, batch 5200, loss[loss=0.3252, simple_loss=0.3721, pruned_loss=0.1392, over 24478.00 frames. ], tot_loss[loss=0.3376, simple_loss=0.37, pruned_loss=0.1526, over 4702050.41 frames. ], batch size: 66, lr: 4.08e-02, grad_scale: 32.0 2023-09-28 13:39:25,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:39:26,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:39:27,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:39:30,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:39:32,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:39:32,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=34666.666666666664, ans=0.003333333333333334 2023-09-28 13:39:33,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=34666.666666666664, ans=0.125 2023-09-28 13:39:35,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:39,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 13:39:41,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:39:42,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:44,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:44,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:39:44,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:46,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 13:39:47,264 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.249e+02 2.894e+02 3.453e+02 4.172e+02 7.980e+02, threshold=6.907e+02, percent-clipped=1.0 2023-09-28 13:39:47,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:39:49,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:51,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 13:39:52,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=34733.333333333336, ans=0.1 2023-09-28 13:39:54,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:39:55,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:39:55,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 13:39:57,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 13:39:59,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 13:40:00,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:00,618 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 13:40:00,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:40:02,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:02,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:40:03,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 13:40:03,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:07,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:09,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=34800.0, ans=0.025 2023-09-28 13:40:12,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 13:40:12,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 13:40:12,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 13:40:12,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=34800.0, ans=0.125 2023-09-28 13:40:18,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 13:40:18,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:40:26,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:40:26,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:28,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 13:40:28,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:28,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 13:40:29,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:30,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:40:32,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=34933.333333333336, ans=0.0032753623188405794 2023-09-28 13:40:33,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:33,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:40:37,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:39,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:39,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:46,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=34933.333333333336, ans=0.125 2023-09-28 13:40:47,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:47,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 13:40:47,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:49,091 INFO [train.py:1039] (1/4) Epoch 1, batch 5250, loss[loss=0.3222, simple_loss=0.376, pruned_loss=0.1342, over 24465.00 frames. ], tot_loss[loss=0.3382, simple_loss=0.3706, pruned_loss=0.1529, over 4692567.19 frames. ], batch size: 69, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:40:49,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:40:49,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:49,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:40:50,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:40:53,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:56,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:56,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:40:58,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:41:02,388 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:41:03,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:41:05,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:41:10,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:41:11,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:41:13,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 13:41:13,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:41:14,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:41:48,799 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.85 vs. limit=6.0 2023-09-28 13:41:54,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=35266.666666666664, ans=0.0 2023-09-28 13:41:54,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=35266.666666666664, ans=0.025 2023-09-28 13:42:03,682 INFO [train.py:1039] (1/4) Epoch 1, batch 5300, loss[loss=0.3506, simple_loss=0.3639, pruned_loss=0.1686, over 23752.00 frames. ], tot_loss[loss=0.3355, simple_loss=0.3678, pruned_loss=0.1515, over 4699197.57 frames. ], batch size: 232, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:42:18,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:42:18,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 13:42:18,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 13:42:18,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:19,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:19,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:19,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:20,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:20,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:42:20,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:20,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:42:20,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:42:20,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 13:42:20,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 13:42:20,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 13:42:21,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:42:21,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 13:42:21,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 13:42:21,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:21,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:22,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:22,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:22,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:42:22,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:22,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:22,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:23,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:23,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:23,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:42:23,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:23,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:42:24,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 13:42:24,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:24,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:24,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 13:42:24,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 13:42:25,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:42:25,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:42:25,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 13:42:25,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 13:42:25,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:26,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:42:26,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:26,436 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 13:42:26,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 13:42:26,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:42:26,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:27,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 13:42:27,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 13:42:27,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 13:42:27,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:30,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.18 vs. limit=6.0 2023-09-28 13:42:40,315 INFO [train.py:1039] (1/4) Epoch 2, batch 0, loss[loss=0.368, simple_loss=0.4082, pruned_loss=0.1639, over 24658.00 frames. ], tot_loss[loss=0.368, simple_loss=0.4082, pruned_loss=0.1639, over 24658.00 frames. ], batch size: 73, lr: 3.99e-02, grad_scale: 32.0 2023-09-28 13:42:40,316 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 13:42:56,276 INFO [train.py:1071] (1/4) Epoch 2, validation: loss=0.367, simple_loss=0.3421, pruned_loss=0.196, over 1125622.00 frames. 2023-09-28 13:42:56,277 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 13:42:56,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=35413.333333333336, ans=0.125 2023-09-28 13:42:57,798 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.107e+02 3.100e+02 3.616e+02 4.753e+02 9.571e+02, threshold=7.232e+02, percent-clipped=1.0 2023-09-28 13:42:59,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 13:42:59,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:43:03,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:43:07,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:07,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:43:07,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:08,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 13:43:10,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 13:43:13,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:14,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:19,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:43:19,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:21,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=35480.0, ans=0.0 2023-09-28 13:43:22,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 13:43:24,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:32,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=35546.666666666664, ans=0.125 2023-09-28 13:43:33,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:43:33,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:34,197 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.48 vs. limit=22.5 2023-09-28 13:43:35,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 13:43:41,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:43:41,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:43:44,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:49,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:43:53,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:00,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 13:44:02,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 13:44:02,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:02,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:03,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:44:05,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:05,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 13:44:08,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:10,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:14,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=35680.0, ans=0.125 2023-09-28 13:44:15,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:44:19,258 INFO [train.py:1039] (1/4) Epoch 2, batch 50, loss[loss=0.3148, simple_loss=0.3467, pruned_loss=0.1414, over 23864.00 frames. ], tot_loss[loss=0.3354, simple_loss=0.3698, pruned_loss=0.1505, over 1081692.93 frames. ], batch size: 212, lr: 3.98e-02, grad_scale: 32.0 2023-09-28 13:44:19,335 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 13:44:19,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:44:22,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:24,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:24,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 13:44:25,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:44:25,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:44:27,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=35746.666666666664, ans=0.125 2023-09-28 13:44:28,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=35746.666666666664, ans=0.0 2023-09-28 13:44:30,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:32,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:35,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:37,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 13:44:37,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:39,331 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.02 vs. limit=22.5 2023-09-28 13:44:42,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.90 vs. limit=15.0 2023-09-28 13:44:43,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:44:44,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 13:44:46,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 13:44:49,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:44:52,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:44:52,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:52,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:53,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=35880.0, ans=0.1 2023-09-28 13:44:54,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:44:54,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:44:54,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:45:02,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:05,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:05,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:45:05,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 13:45:07,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:45:08,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=35946.666666666664, ans=15.0 2023-09-28 13:45:08,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:45:09,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 13:45:10,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:12,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 13:45:18,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:45:18,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:22,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:22,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:22,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:27,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 13:45:27,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 13:45:27,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:29,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:30,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:45:30,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:30,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 13:45:30,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 13:45:32,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:45:32,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:33,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:45:35,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 13:45:35,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 13:45:35,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:37,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:45:38,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:45:42,091 INFO [train.py:1039] (1/4) Epoch 2, batch 100, loss[loss=0.3211, simple_loss=0.3654, pruned_loss=0.1384, over 23759.00 frames. ], tot_loss[loss=0.3321, simple_loss=0.3675, pruned_loss=0.1483, over 1882386.25 frames. ], batch size: 85, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:45:43,570 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.251e+02 2.783e+02 3.462e+02 4.523e+02 1.049e+03, threshold=6.924e+02, percent-clipped=4.0 2023-09-28 13:45:43,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:45:45,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:45:48,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:45:49,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=36080.0, ans=0.2 2023-09-28 13:45:50,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 13:45:50,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:56,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:45:56,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:56,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:56,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:57,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:59,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 13:46:00,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:46:01,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:02,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:02,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:46:05,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.09 vs. limit=6.0 2023-09-28 13:46:05,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 13:46:07,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:08,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:10,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:46:12,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:46:15,759 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 13:46:15,803 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 13:46:15,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:46:15,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:46:16,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=36213.333333333336, ans=0.0 2023-09-28 13:46:20,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:46:22,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:23,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:31,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:32,837 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 13:46:34,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:46:36,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=36280.0, ans=0.125 2023-09-28 13:46:39,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:46:41,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:46:42,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:45,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:48,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:46:49,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:46:52,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:54,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:54,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:55,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:46:55,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:57,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 13:46:57,357 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 13:46:57,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:58,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:47:01,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:01,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:01,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:47:01,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:47:01,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:47:01,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:03,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:05,404 INFO [train.py:1039] (1/4) Epoch 2, batch 150, loss[loss=0.3157, simple_loss=0.3492, pruned_loss=0.1411, over 23646.00 frames. ], tot_loss[loss=0.3321, simple_loss=0.3668, pruned_loss=0.1487, over 2514197.17 frames. ], batch size: 135, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:47:05,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:05,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:47:07,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:47:08,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:14,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:47:14,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:14,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:16,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.95 vs. limit=22.5 2023-09-28 13:47:17,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:47:17,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:19,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=36413.333333333336, ans=0.002953623188405796 2023-09-28 13:47:20,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:47:20,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:24,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=36480.0, ans=0.002939130434782609 2023-09-28 13:47:25,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 13:47:27,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 13:47:27,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 13:47:29,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=36480.0, ans=0.125 2023-09-28 13:47:30,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:47:30,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:47:31,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:47:33,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:47:33,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:33,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:33,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:34,951 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 13:47:37,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:44,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:47,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:47:50,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 13:47:54,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:47:56,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:56,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:47:59,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:48:01,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:48:01,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=36613.333333333336, ans=0.125 2023-09-28 13:48:02,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:48:04,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:06,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 13:48:08,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=36613.333333333336, ans=0.125 2023-09-28 13:48:11,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:11,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:11,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:48:11,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:48:13,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=36680.0, ans=0.125 2023-09-28 13:48:14,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:16,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 13:48:16,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=36680.0, ans=0.125 2023-09-28 13:48:19,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:48:21,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:48:21,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:24,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:48:25,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 13:48:25,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:48:25,085 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 13:48:30,043 INFO [train.py:1039] (1/4) Epoch 2, batch 200, loss[loss=0.4681, simple_loss=0.4455, pruned_loss=0.2453, over 19412.00 frames. ], tot_loss[loss=0.33, simple_loss=0.3652, pruned_loss=0.1474, over 3009278.54 frames. ], batch size: 389, lr: 3.96e-02, grad_scale: 32.0 2023-09-28 13:48:30,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:31,379 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.151e+02 2.761e+02 3.224e+02 4.160e+02 8.294e+02, threshold=6.447e+02, percent-clipped=1.0 2023-09-28 13:48:33,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:48:34,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:48:38,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 13:48:39,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:39,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=36746.666666666664, ans=0.0 2023-09-28 13:48:40,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:42,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 13:48:44,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:48:45,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:45,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:50,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:48:52,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:52,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:06,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=36880.0, ans=0.2 2023-09-28 13:49:10,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:49:10,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:49:11,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:49:13,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:49:13,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:49:13,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:49:13,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:15,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:49:15,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:16,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:19,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 13:49:19,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:49:19,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:21,468 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.90 vs. limit=22.5 2023-09-28 13:49:23,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:49:23,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.77 vs. limit=15.0 2023-09-28 13:49:24,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=36946.666666666664, ans=0.125 2023-09-28 13:49:27,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:35,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:35,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:49:35,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.94 vs. limit=12.0 2023-09-28 13:49:44,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:45,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 13:49:47,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:47,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:49:47,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:47,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:49:51,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 13:49:52,437 INFO [train.py:1039] (1/4) Epoch 2, batch 250, loss[loss=0.3614, simple_loss=0.3728, pruned_loss=0.175, over 23848.00 frames. ], tot_loss[loss=0.3318, simple_loss=0.3654, pruned_loss=0.1491, over 3385255.73 frames. ], batch size: 195, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:49:52,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:49:52,560 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 13:49:54,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:54,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=37080.0, ans=0.0 2023-09-28 13:49:55,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:49:57,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:57,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:50:00,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:50:01,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:50:03,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:50:09,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:50:13,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=37146.666666666664, ans=0.125 2023-09-28 13:50:21,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:24,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:50:24,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:50:31,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:50:33,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:50:34,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:50:34,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:36,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:50:36,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:50:36,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:37,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:50:40,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 13:50:40,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:43,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:50:43,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:50:43,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:50:45,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:50:46,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:50:46,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:50:49,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:50:51,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:50:52,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:50:52,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=37280.0, ans=0.125 2023-09-28 13:50:57,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:50:58,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=37346.666666666664, ans=0.125 2023-09-28 13:51:01,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:02,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:51:08,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:11,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:51:14,097 INFO [train.py:1039] (1/4) Epoch 2, batch 300, loss[loss=0.3044, simple_loss=0.3187, pruned_loss=0.145, over 22655.00 frames. ], tot_loss[loss=0.3285, simple_loss=0.3634, pruned_loss=0.1468, over 3692419.97 frames. ], batch size: 322, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:51:14,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 13:51:15,696 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.217e+02 3.009e+02 3.543e+02 4.126e+02 1.008e+03, threshold=7.086e+02, percent-clipped=8.0 2023-09-28 13:51:15,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:51:15,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:51:17,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 13:51:18,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:51:19,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:51:19,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 13:51:19,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=37413.333333333336, ans=0.125 2023-09-28 13:51:23,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:24,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:51:29,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:51:29,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 13:51:30,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-09-28 13:51:31,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:31,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:51:31,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 13:51:32,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:39,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:51:41,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=37480.0, ans=0.125 2023-09-28 13:51:42,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:51:42,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 13:51:45,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 13:51:47,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:49,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:51,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:51,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 13:51:51,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:51:55,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:51:57,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:51:57,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:01,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:52:01,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 13:52:02,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=37613.333333333336, ans=0.1 2023-09-28 13:52:04,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:52:07,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:10,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 13:52:10,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:16,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:52:17,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:52:17,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 13:52:18,433 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.25 vs. limit=22.5 2023-09-28 13:52:22,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:22,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:52:25,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:27,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:52:27,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 13:52:29,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:52:29,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:30,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 13:52:32,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:33,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:33,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:35,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:36,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:37,456 INFO [train.py:1039] (1/4) Epoch 2, batch 350, loss[loss=0.3043, simple_loss=0.348, pruned_loss=0.1303, over 24685.00 frames. ], tot_loss[loss=0.3264, simple_loss=0.3616, pruned_loss=0.1456, over 3930182.31 frames. ], batch size: 65, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:52:41,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:41,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:52:44,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:48,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.74 vs. limit=15.0 2023-09-28 13:52:51,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:54,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:54,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:54,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=37813.333333333336, ans=0.125 2023-09-28 13:52:56,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 13:52:57,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:58,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 13:52:59,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:01,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 13:53:03,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:06,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 13:53:08,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:53:10,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:12,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:53:13,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:13,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:15,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:15,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:15,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:53:17,712 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.64 vs. limit=15.0 2023-09-28 13:53:18,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:53:18,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:24,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=15.0 2023-09-28 13:53:25,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:53:25,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:53:26,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:53:26,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:31,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 13:53:31,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:31,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=37946.666666666664, ans=0.125 2023-09-28 13:53:38,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:38,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:38,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:53:39,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 13:53:41,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=37946.666666666664, ans=0.125 2023-09-28 13:53:42,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:42,918 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 13:53:44,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 13:53:44,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:48,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:48,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 13:53:49,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:51,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:53:55,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:56,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:56,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:58,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:54:01,050 INFO [train.py:1039] (1/4) Epoch 2, batch 400, loss[loss=0.2952, simple_loss=0.3353, pruned_loss=0.1275, over 24454.00 frames. ], tot_loss[loss=0.3265, simple_loss=0.3608, pruned_loss=0.1461, over 4087759.10 frames. ], batch size: 58, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:54:01,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:54:02,564 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.905e+02 3.509e+02 4.327e+02 7.986e+02, threshold=7.018e+02, percent-clipped=1.0 2023-09-28 13:54:05,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:54:07,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 13:54:07,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:07,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:09,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:54:09,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:12,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:14,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:15,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.85 vs. limit=15.0 2023-09-28 13:54:17,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 13:54:18,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 13:54:18,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:20,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 13:54:22,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:24,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:54:24,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 13:54:26,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:54:26,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:26,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:28,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=38146.666666666664, ans=0.125 2023-09-28 13:54:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 13:54:30,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 13:54:35,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:36,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:38,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 13:54:40,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 13:54:42,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:54:45,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:54:51,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 13:54:53,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:54:55,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 13:54:55,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=38280.0, ans=0.1 2023-09-28 13:55:00,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:55:00,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=38280.0, ans=0.07 2023-09-28 13:55:01,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:55:01,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 13:55:05,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:55:08,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:55:10,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:55:13,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:15,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 13:55:17,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:55:17,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=38346.666666666664, ans=0.125 2023-09-28 13:55:18,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 13:55:18,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:55:20,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:55:23,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 13:55:24,946 INFO [train.py:1039] (1/4) Epoch 2, batch 450, loss[loss=0.3379, simple_loss=0.3613, pruned_loss=0.1572, over 23519.00 frames. ], tot_loss[loss=0.328, simple_loss=0.3623, pruned_loss=0.1468, over 4222537.40 frames. ], batch size: 256, lr: 3.93e-02, grad_scale: 32.0 2023-09-28 13:55:25,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:55:26,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:55:26,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:55:30,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 13:55:30,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:55:31,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:55:31,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:55:31,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 13:55:31,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:55:33,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:55:37,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:55:41,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.79 vs. limit=15.0 2023-09-28 13:55:47,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:47,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:55:51,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 13:55:51,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 13:55:55,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:55:58,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:58,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:02,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=38546.666666666664, ans=0.125 2023-09-28 13:56:03,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:04,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:08,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 13:56:08,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 13:56:09,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 13:56:10,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:12,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:12,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:56:13,195 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=5.87 vs. limit=8.0 2023-09-28 13:56:15,215 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 13:56:15,231 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 13:56:15,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:56:18,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:56:20,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:56:23,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:56:23,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:56:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 13:56:25,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 13:56:27,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:30,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:56:30,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:56:32,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 13:56:36,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:56:37,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 13:56:38,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 13:56:39,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:45,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:56:45,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=38680.0, ans=0.125 2023-09-28 13:56:46,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:56:48,181 INFO [train.py:1039] (1/4) Epoch 2, batch 500, loss[loss=0.4571, simple_loss=0.4415, pruned_loss=0.2363, over 19443.00 frames. ], tot_loss[loss=0.3296, simple_loss=0.3635, pruned_loss=0.1479, over 4333960.56 frames. ], batch size: 388, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:56:48,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:56:48,359 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 13:56:50,428 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.109e+02 2.855e+02 3.493e+02 4.304e+02 8.305e+02, threshold=6.986e+02, percent-clipped=1.0 2023-09-28 13:56:53,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:53,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:56:55,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:55,163 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 13:56:56,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 13:56:56,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:00,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:57:05,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:57:06,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:57:08,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:57:08,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:57:09,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:18,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:18,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 13:57:19,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:57:19,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:21,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 13:57:21,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:57:25,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:57:25,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:57:27,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:57:27,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:28,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 13:57:33,677 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 13:57:34,330 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.90 vs. limit=12.0 2023-09-28 13:57:35,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=38880.0, ans=0.0 2023-09-28 13:57:38,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:57:38,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:39,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:57:41,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=38946.666666666664, ans=0.125 2023-09-28 13:57:43,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 13:57:46,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:57:47,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:57:50,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:53,186 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.34 vs. limit=22.5 2023-09-28 13:57:53,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:58,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:02,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 13:58:02,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:02,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:06,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 13:58:08,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:58:08,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=39013.333333333336, ans=0.0 2023-09-28 13:58:09,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:11,442 INFO [train.py:1039] (1/4) Epoch 2, batch 550, loss[loss=0.3686, simple_loss=0.4019, pruned_loss=0.1677, over 24424.00 frames. ], tot_loss[loss=0.3327, simple_loss=0.3655, pruned_loss=0.15, over 4402744.94 frames. ], batch size: 69, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:58:14,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 13:58:15,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=39080.0, ans=0.125 2023-09-28 13:58:16,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 13:58:16,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:16,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 13:58:17,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:58:17,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:19,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:19,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:20,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:58:20,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:58:22,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:23,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 13:58:23,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:58:29,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:30,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:30,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:58:32,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:35,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 13:58:37,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 13:58:38,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:58:43,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:58:43,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:44,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:58:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:47,951 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 13:58:48,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=39213.333333333336, ans=0.2 2023-09-28 13:58:48,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=39213.333333333336, ans=0.125 2023-09-28 13:58:49,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:50,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 13:58:52,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:54,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:58:54,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:58:55,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:55,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 13:58:59,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 13:58:59,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:59,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:59:01,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:59:01,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:59:04,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:59:06,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:59:11,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:59:14,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:14,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:59:15,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:59:17,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:18,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:59:20,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:21,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=39346.666666666664, ans=15.0 2023-09-28 13:59:22,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:59:22,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:59:29,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 13:59:32,915 INFO [train.py:1039] (1/4) Epoch 2, batch 600, loss[loss=0.332, simple_loss=0.371, pruned_loss=0.1465, over 24511.00 frames. ], tot_loss[loss=0.3313, simple_loss=0.3651, pruned_loss=0.1488, over 4472568.92 frames. ], batch size: 66, lr: 3.91e-02, grad_scale: 32.0 2023-09-28 13:59:33,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 13:59:34,972 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.924e+02 3.724e+02 4.722e+02 8.175e+02, threshold=7.448e+02, percent-clipped=4.0 2023-09-28 13:59:36,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:59:36,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:59:36,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:44,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:59:45,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:59:46,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=39413.333333333336, ans=0.0 2023-09-28 13:59:47,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 13:59:48,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=39413.333333333336, ans=0.125 2023-09-28 13:59:49,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:59:52,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:59:53,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=39480.0, ans=0.125 2023-09-28 13:59:53,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.57 vs. limit=15.0 2023-09-28 13:59:54,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:55,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 13:59:56,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:00:04,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 14:00:06,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:00:06,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:08,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:00:13,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:00:13,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:00:13,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:20,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:00:24,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=39613.333333333336, ans=15.0 2023-09-28 14:00:27,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:27,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:00:27,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:34,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 14:00:39,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:00:39,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:00:44,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 14:00:46,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:00:50,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 14:00:50,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:00:50,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:00:57,132 INFO [train.py:1039] (1/4) Epoch 2, batch 650, loss[loss=0.3208, simple_loss=0.3675, pruned_loss=0.137, over 23726.00 frames. ], tot_loss[loss=0.3295, simple_loss=0.3638, pruned_loss=0.1476, over 4523523.13 frames. ], batch size: 85, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:00:57,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:01:00,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:01:01,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:02,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=39746.666666666664, ans=0.0 2023-09-28 14:01:03,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:01:04,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:08,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 14:01:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:01:10,172 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.56 vs. limit=22.5 2023-09-28 14:01:14,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:01:14,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:17,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:21,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=39813.333333333336, ans=0.1 2023-09-28 14:01:22,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 14:01:25,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:25,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:30,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:01:31,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:01:32,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:33,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:35,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:01:36,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:38,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:01:39,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:01:39,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=39880.0, ans=0.125 2023-09-28 14:01:40,954 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 14:01:40,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:40,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:42,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:44,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:44,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:01:45,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:01:47,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 14:01:47,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:01:47,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=39946.666666666664, ans=0.125 2023-09-28 14:01:47,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=39946.666666666664, ans=0.125 2023-09-28 14:01:49,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:49,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:01:49,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:51,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:01:52,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 14:01:54,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 14:01:54,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:55,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:56,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:01:56,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:58,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:02:05,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:05,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:08,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:02:08,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=40013.333333333336, ans=0.125 2023-09-28 14:02:10,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:10,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:02:11,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:19,116 INFO [train.py:1039] (1/4) Epoch 2, batch 700, loss[loss=0.3302, simple_loss=0.3473, pruned_loss=0.1565, over 23803.00 frames. ], tot_loss[loss=0.3277, simple_loss=0.3622, pruned_loss=0.1466, over 4582283.07 frames. ], batch size: 212, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:02:19,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:02:19,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:20,673 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.820e+02 3.434e+02 4.210e+02 9.710e+02, threshold=6.868e+02, percent-clipped=2.0 2023-09-28 14:02:20,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:20,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:24,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 14:02:26,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 14:02:29,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 14:02:29,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:31,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=40080.0, ans=0.0021565217391304355 2023-09-28 14:02:33,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:02:35,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 14:02:37,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=40146.666666666664, ans=0.0 2023-09-28 14:02:39,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:42,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:02:43,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:45,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:02:46,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:48,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=40146.666666666664, ans=10.0 2023-09-28 14:02:48,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=40146.666666666664, ans=0.0 2023-09-28 14:02:49,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:50,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.15 vs. limit=12.0 2023-09-28 14:02:51,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:02:51,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:02:53,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=40213.333333333336, ans=0.2 2023-09-28 14:02:54,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 14:02:57,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 14:02:58,636 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.18 vs. limit=15.0 2023-09-28 14:03:01,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:03:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:03:03,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:03:05,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=40213.333333333336, ans=0.1 2023-09-28 14:03:08,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:03:08,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 14:03:14,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:15,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:03:15,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 14:03:21,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:03:21,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:23,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=40280.0, ans=0.0021130434782608704 2023-09-28 14:03:24,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:03:29,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:03:29,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 14:03:35,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 14:03:35,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 14:03:38,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:40,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:40,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:03:42,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:42,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 14:03:43,869 INFO [train.py:1039] (1/4) Epoch 2, batch 750, loss[loss=0.3416, simple_loss=0.3851, pruned_loss=0.149, over 24346.00 frames. ], tot_loss[loss=0.3263, simple_loss=0.3609, pruned_loss=0.1459, over 4613915.79 frames. ], batch size: 74, lr: 3.89e-02, grad_scale: 32.0 2023-09-28 14:03:45,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 14:03:47,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 14:03:47,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 14:03:48,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 14:03:48,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 14:03:48,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:03:51,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 14:03:53,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:53,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:03:54,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:03:56,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:56,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:03:57,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:59,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:04:00,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:04:02,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:04:05,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:06,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 14:04:09,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:04:11,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:04:16,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 14:04:16,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:04:19,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 14:04:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 14:04:19,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 14:04:21,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:04:21,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:04:22,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:04:28,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:04:28,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:28,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:04:32,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:35,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:04:35,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 14:04:35,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:04:36,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 14:04:36,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:04:40,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:04:42,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 14:04:42,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=40613.333333333336, ans=0.2 2023-09-28 14:04:44,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:48,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:04:50,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:04:51,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:54,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:04:57,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 14:04:57,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:04:59,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:05,496 INFO [train.py:1039] (1/4) Epoch 2, batch 800, loss[loss=0.3222, simple_loss=0.3717, pruned_loss=0.1363, over 23731.00 frames. ], tot_loss[loss=0.3262, simple_loss=0.3612, pruned_loss=0.1456, over 4640486.64 frames. ], batch size: 85, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:05:05,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:05,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:05:07,076 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.053e+02 2.783e+02 3.464e+02 4.160e+02 6.985e+02, threshold=6.929e+02, percent-clipped=3.0 2023-09-28 14:05:14,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:14,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:17,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:05:17,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:17,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=40746.666666666664, ans=0.0 2023-09-28 14:05:18,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:20,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:22,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:26,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:26,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=40813.333333333336, ans=0.125 2023-09-28 14:05:27,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:05:30,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 14:05:31,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:31,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:05:33,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:34,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 14:05:34,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:34,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 14:05:37,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:39,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:41,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:41,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:44,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=40880.0, ans=0.1 2023-09-28 14:05:45,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:45,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:52,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:05:52,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:05:52,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 14:05:53,869 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 14:05:53,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 14:05:53,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:05:53,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:57,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:57,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:06:03,287 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 14:06:03,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 14:06:03,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=40946.666666666664, ans=0.1 2023-09-28 14:06:06,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:06:07,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:06:11,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:06:15,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:15,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 14:06:17,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:06:19,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 14:06:25,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:27,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:06:27,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 14:06:29,447 INFO [train.py:1039] (1/4) Epoch 2, batch 850, loss[loss=0.338, simple_loss=0.367, pruned_loss=0.1545, over 23247.00 frames. ], tot_loss[loss=0.3266, simple_loss=0.3619, pruned_loss=0.1456, over 4670434.70 frames. ], batch size: 105, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:06:29,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:06:29,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:31,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 14:06:31,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:31,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:06:33,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:37,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:06:37,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:06:38,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 14:06:40,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 14:06:40,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 14:06:41,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:41,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:06:43,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.78 vs. limit=6.0 2023-09-28 14:06:44,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:44,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:46,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:06:48,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.07 vs. limit=12.0 2023-09-28 14:06:50,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:50,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:50,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 14:06:54,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 14:06:59,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:07:01,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 14:07:05,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 14:07:05,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 14:07:08,416 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 14:07:08,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:08,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:07:08,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:07:11,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:12,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:12,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 14:07:13,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=41213.333333333336, ans=0.0019101449275362309 2023-09-28 14:07:15,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=41213.333333333336, ans=0.125 2023-09-28 14:07:17,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:19,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:20,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:07:20,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:07:21,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:07:22,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:07:22,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 14:07:28,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:07:28,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:28,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:07:30,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:32,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:34,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:35,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=41346.666666666664, ans=0.0 2023-09-28 14:07:37,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:07:39,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:07:40,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:07:41,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:07:41,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=41346.666666666664, ans=0.0018811594202898553 2023-09-28 14:07:49,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:07:49,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:51,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 14:07:51,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:07:51,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:53,231 INFO [train.py:1039] (1/4) Epoch 2, batch 900, loss[loss=0.3192, simple_loss=0.3516, pruned_loss=0.1434, over 17183.00 frames. ], tot_loss[loss=0.3274, simple_loss=0.3622, pruned_loss=0.1463, over 4664339.05 frames. ], batch size: 37, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:07:54,704 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.862e+02 3.366e+02 4.167e+02 7.237e+02, threshold=6.733e+02, percent-clipped=1.0 2023-09-28 14:07:54,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 14:07:59,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:08:01,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=41413.333333333336, ans=0.125 2023-09-28 14:08:02,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:02,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 14:08:06,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:08:07,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 14:08:09,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:08:09,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:08:09,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:11,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:08:11,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:08:24,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:24,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:24,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:08:28,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:31,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 14:08:33,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:08:39,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:08:40,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:08:42,605 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 14:08:42,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 14:08:47,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:08:47,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:08:49,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:08:58,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:58,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:08:59,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 14:08:59,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:09:04,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 14:09:06,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:09:06,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:07,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:09:07,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:09,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 14:09:11,099 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 14:09:14,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:09:14,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 14:09:16,240 INFO [train.py:1039] (1/4) Epoch 2, batch 950, loss[loss=0.3137, simple_loss=0.3619, pruned_loss=0.1328, over 24303.00 frames. ], tot_loss[loss=0.3281, simple_loss=0.3634, pruned_loss=0.1464, over 4670694.97 frames. ], batch size: 74, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:09:17,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:21,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 14:09:25,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.32 vs. limit=15.0 2023-09-28 14:09:26,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:30,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:30,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:31,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:09:33,444 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 14:09:37,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:38,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:39,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:40,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:09:40,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 14:09:41,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:09:43,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:44,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 14:09:46,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:49,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:49,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:51,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:51,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 14:09:52,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=41880.0, ans=0.125 2023-09-28 14:09:53,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:09:55,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:57,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:10:00,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=41880.0, ans=0.0 2023-09-28 14:10:02,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:02,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:10:06,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 14:10:09,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:10:09,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:10:09,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:09,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:09,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:10:13,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 14:10:16,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:10:18,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:19,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:19,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 14:10:19,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:19,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:10:21,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 14:10:26,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:10:30,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:35,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:37,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 14:10:37,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 14:10:39,875 INFO [train.py:1039] (1/4) Epoch 2, batch 1000, loss[loss=0.2844, simple_loss=0.3415, pruned_loss=0.1136, over 24430.00 frames. ], tot_loss[loss=0.3267, simple_loss=0.3624, pruned_loss=0.1455, over 4675037.99 frames. ], batch size: 63, lr: 3.86e-02, grad_scale: 16.0 2023-09-28 14:10:40,234 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:10:41,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:42,873 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.243e+02 2.891e+02 3.339e+02 3.802e+02 9.955e+02, threshold=6.678e+02, percent-clipped=4.0 2023-09-28 14:10:43,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.87 vs. limit=15.0 2023-09-28 14:10:44,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 14:10:44,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:47,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.33 vs. limit=15.0 2023-09-28 14:10:51,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:10:52,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 14:10:52,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 14:10:57,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:10:57,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:59,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:00,033 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.35 vs. limit=15.0 2023-09-28 14:11:03,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 14:11:06,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 14:11:09,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 14:11:09,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:11,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 14:11:13,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 14:11:13,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 14:11:14,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:16,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:24,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:24,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:11:26,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:27,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:27,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 14:11:27,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:29,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:11:29,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:31,377 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 14:11:34,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 14:11:36,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 14:11:37,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 14:11:39,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:11:44,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=42280.0, ans=0.125 2023-09-28 14:11:46,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:46,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:11:46,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:48,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:11:49,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 14:11:51,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:11:51,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 14:11:51,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 14:11:52,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:11:52,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:54,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:11:58,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:11:59,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:03,234 INFO [train.py:1039] (1/4) Epoch 2, batch 1050, loss[loss=0.34, simple_loss=0.3827, pruned_loss=0.1487, over 24391.00 frames. ], tot_loss[loss=0.3242, simple_loss=0.3603, pruned_loss=0.144, over 4687065.81 frames. ], batch size: 77, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:12:04,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:12:06,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:12:08,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:12:09,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:11,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:13,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:12:15,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:12:15,827 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-28 14:12:17,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:12:18,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:12:18,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:12:20,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:12:20,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 14:12:21,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:21,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 14:12:22,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=42480.0, ans=0.1 2023-09-28 14:12:24,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:12:24,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 14:12:24,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:12:26,045 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.61 vs. limit=22.5 2023-09-28 14:12:30,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=42480.0, ans=0.125 2023-09-28 14:12:31,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:33,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:12:35,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:38,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 14:12:38,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 14:12:39,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:42,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 14:12:44,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=42546.666666666664, ans=0.0016202898550724647 2023-09-28 14:12:45,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 14:12:46,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:50,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:12:53,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:12:53,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:12:55,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:12:58,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:13:03,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 14:13:04,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 14:13:04,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 14:13:06,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:06,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:13:08,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 14:13:12,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:13:15,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:15,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:17,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:17,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:19,853 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.72 vs. limit=22.5 2023-09-28 14:13:20,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:20,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 14:13:22,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:22,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 14:13:22,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 14:13:24,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:13:27,025 INFO [train.py:1039] (1/4) Epoch 2, batch 1100, loss[loss=0.3207, simple_loss=0.3656, pruned_loss=0.1378, over 23958.00 frames. ], tot_loss[loss=0.3234, simple_loss=0.36, pruned_loss=0.1434, over 4696265.12 frames. ], batch size: 80, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:13:29,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:13:29,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=42746.666666666664, ans=0.125 2023-09-28 14:13:30,547 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.833e+02 3.199e+02 3.709e+02 7.263e+02, threshold=6.397e+02, percent-clipped=1.0 2023-09-28 14:13:33,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:13:35,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=42746.666666666664, ans=0.0 2023-09-28 14:13:39,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:13:41,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:13:41,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:41,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 14:13:43,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:13:45,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=42813.333333333336, ans=0.0 2023-09-28 14:13:47,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:13:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:13:53,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:13:53,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 14:13:55,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:13:56,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:56,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:57,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=42813.333333333336, ans=0.125 2023-09-28 14:13:59,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:14:00,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:14:06,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:14:06,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=42880.0, ans=0.05 2023-09-28 14:14:08,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 14:14:08,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=42880.0, ans=0.2 2023-09-28 14:14:09,834 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 14:14:09,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:10,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=42880.0, ans=0.125 2023-09-28 14:14:13,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:14,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:14:14,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:14:16,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 14:14:17,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:14:17,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:14:17,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:14:19,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 14:14:23,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=25.55 vs. limit=22.5 2023-09-28 14:14:27,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:14:27,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 14:14:28,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:14:34,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:14:38,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 14:14:38,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:14:39,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:39,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=43013.333333333336, ans=0.1 2023-09-28 14:14:42,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:14:42,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:44,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 14:14:44,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:14:44,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=43013.333333333336, ans=0.125 2023-09-28 14:14:45,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:47,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 14:14:47,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:14:47,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 14:14:48,582 INFO [train.py:1039] (1/4) Epoch 2, batch 1150, loss[loss=0.3166, simple_loss=0.3707, pruned_loss=0.1313, over 24553.00 frames. ], tot_loss[loss=0.3237, simple_loss=0.3605, pruned_loss=0.1434, over 4702026.63 frames. ], batch size: 71, lr: 3.84e-02, grad_scale: 16.0 2023-09-28 14:14:48,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:14:48,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:14:50,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:14:55,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:14:59,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:15:00,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:00,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:15:00,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 14:15:02,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:04,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 14:15:05,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:05,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:15:10,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 14:15:12,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:17,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:18,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:18,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 14:15:18,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:15:18,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:21,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 14:15:23,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:25,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:25,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=43213.333333333336, ans=0.125 2023-09-28 14:15:35,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:42,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:43,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 14:15:45,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:45,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:52,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=43280.0, ans=0.125 2023-09-28 14:15:53,343 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 14:15:54,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:57,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.43 vs. limit=6.0 2023-09-28 14:16:01,850 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 14:16:06,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:08,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:16:08,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:16:09,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:16:11,794 INFO [train.py:1039] (1/4) Epoch 2, batch 1200, loss[loss=0.3885, simple_loss=0.3992, pruned_loss=0.189, over 22702.00 frames. ], tot_loss[loss=0.3244, simple_loss=0.3612, pruned_loss=0.1438, over 4704223.05 frames. ], batch size: 322, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:16:13,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:14,999 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.963e+02 2.991e+02 3.527e+02 4.351e+02 6.174e+02, threshold=7.053e+02, percent-clipped=0.0 2023-09-28 14:16:17,449 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=15.0 2023-09-28 14:16:18,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:16:18,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:16:18,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=43413.333333333336, ans=0.1 2023-09-28 14:16:20,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:20,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:21,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:16:21,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:16:25,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:16:26,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:26,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:28,537 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 14:16:32,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 14:16:36,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:16:38,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:16:39,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=43480.0, ans=0.125 2023-09-28 14:16:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:44,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:16:44,845 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 14:16:44,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:54,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:16:54,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:16:54,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 14:16:56,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:16:59,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 14:17:04,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 14:17:04,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:17:06,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:17:07,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:07,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:17:09,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:17:09,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:17:12,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:17:12,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 14:17:14,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:17:14,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:14,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:17:17,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:17,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:21,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:17:24,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:17:27,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 14:17:28,881 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=12.0 2023-09-28 14:17:31,124 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 14:17:34,026 INFO [train.py:1039] (1/4) Epoch 2, batch 1250, loss[loss=0.3127, simple_loss=0.3626, pruned_loss=0.1314, over 24058.00 frames. ], tot_loss[loss=0.3243, simple_loss=0.3611, pruned_loss=0.1437, over 4707818.19 frames. ], batch size: 80, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:17:34,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:17:35,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.72 vs. limit=15.0 2023-09-28 14:17:35,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:37,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:17:39,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:42,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 14:17:47,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:17:47,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:47,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 14:17:49,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:17:51,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:17:54,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:17:55,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:56,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:17:56,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:17:58,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:18:02,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:18:02,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:18:02,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:04,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:18:05,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:09,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:10,186 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.58 vs. limit=15.0 2023-09-28 14:18:11,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:18:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 14:18:19,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:18:21,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:21,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 14:18:23,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:18:23,332 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 14:18:23,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:23,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:29,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:32,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.67 vs. limit=15.0 2023-09-28 14:18:33,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:18:36,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 14:18:36,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 14:18:36,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 14:18:39,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:18:41,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 14:18:42,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:44,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:18:44,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:18:46,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 14:18:46,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:18:46,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:18:46,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:18:47,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:50,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 14:18:53,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:55,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:18:56,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:18:58,475 INFO [train.py:1039] (1/4) Epoch 2, batch 1300, loss[loss=0.3216, simple_loss=0.3664, pruned_loss=0.1383, over 24041.00 frames. ], tot_loss[loss=0.3238, simple_loss=0.3608, pruned_loss=0.1433, over 4720230.13 frames. ], batch size: 86, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:18:58,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:19:01,542 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.943e+02 3.508e+02 4.700e+02 1.321e+03, threshold=7.016e+02, percent-clipped=7.0 2023-09-28 14:19:01,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:19:01,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 14:19:03,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=44080.0, ans=0.2 2023-09-28 14:19:07,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:08,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:19:08,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:10,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:19:11,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:19:12,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=44080.0, ans=0.125 2023-09-28 14:19:13,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 14:19:15,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=44146.666666666664, ans=0.125 2023-09-28 14:19:18,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:19:18,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:19:21,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 14:19:23,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=44146.666666666664, ans=0.125 2023-09-28 14:19:25,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:19:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:32,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:32,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:34,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:35,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:19:35,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:19:37,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 14:19:37,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=44213.333333333336, ans=0.0 2023-09-28 14:19:42,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:19:43,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:19:44,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 14:19:45,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:19:47,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:19:50,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:51,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 14:19:51,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 14:19:54,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:57,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=44280.0, ans=0.1 2023-09-28 14:19:58,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:58,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:20:00,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=44280.0, ans=0.125 2023-09-28 14:20:03,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 14:20:04,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 14:20:04,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 14:20:09,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:20:11,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=44346.666666666664, ans=0.125 2023-09-28 14:20:13,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 14:20:15,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:20,906 INFO [train.py:1039] (1/4) Epoch 2, batch 1350, loss[loss=0.3346, simple_loss=0.3801, pruned_loss=0.1446, over 24054.00 frames. ], tot_loss[loss=0.3226, simple_loss=0.3594, pruned_loss=0.143, over 4715817.24 frames. ], batch size: 80, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:20:21,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 14:20:21,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=44413.333333333336, ans=0.125 2023-09-28 14:20:24,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=44413.333333333336, ans=0.125 2023-09-28 14:20:25,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:27,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:31,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:31,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:32,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:20:34,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:38,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:41,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 14:20:43,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:20:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:20:44,046 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.24 vs. limit=10.0 2023-09-28 14:20:45,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 14:20:48,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:20:48,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:20:48,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 14:20:51,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 14:20:53,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 14:20:54,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:54,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 14:20:54,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=44546.666666666664, ans=0.125 2023-09-28 14:21:07,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:11,998 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.87 vs. limit=22.5 2023-09-28 14:21:16,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:16,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:16,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 14:21:22,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:24,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 14:21:24,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:21:24,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:21:27,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:21:30,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 14:21:31,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:21:38,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 14:21:40,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 14:21:43,259 INFO [train.py:1039] (1/4) Epoch 2, batch 1400, loss[loss=0.2543, simple_loss=0.3136, pruned_loss=0.09747, over 24627.00 frames. ], tot_loss[loss=0.3205, simple_loss=0.3575, pruned_loss=0.1417, over 4715537.64 frames. ], batch size: 60, lr: 3.81e-02, grad_scale: 32.0 2023-09-28 14:21:45,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 14:21:46,401 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.080e+02 2.861e+02 3.179e+02 3.709e+02 7.568e+02, threshold=6.358e+02, percent-clipped=1.0 2023-09-28 14:21:46,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:51,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:21:51,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:21:58,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 14:21:58,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 14:22:04,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=44813.333333333336, ans=0.0011275362318840573 2023-09-28 14:22:07,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:22:11,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:13,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:22:13,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:22:18,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:22:18,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:22:18,946 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.07 vs. limit=15.0 2023-09-28 14:22:28,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=44880.0, ans=0.125 2023-09-28 14:22:29,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:29,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:34,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 14:22:36,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:22:37,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:22:37,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:22:39,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:40,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:22:40,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:22:40,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:22:43,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 14:22:43,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:22:47,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:51,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:22:59,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=45013.333333333336, ans=0.125 2023-09-28 14:23:00,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 14:23:02,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:23:02,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:23:04,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:23:05,029 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.74 vs. limit=15.0 2023-09-28 14:23:05,538 INFO [train.py:1039] (1/4) Epoch 2, batch 1450, loss[loss=0.3376, simple_loss=0.3751, pruned_loss=0.1501, over 24640.00 frames. ], tot_loss[loss=0.3196, simple_loss=0.3571, pruned_loss=0.141, over 4719839.81 frames. ], batch size: 65, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:23:05,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:05,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:23:10,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:23:10,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:23:10,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:10,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:23:16,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:18,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:23:20,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:23:20,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 14:23:20,729 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:23:21,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:23:24,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 14:23:24,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:25,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:25,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 14:23:27,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:28,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:23:28,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 14:23:28,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:30,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:23:32,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:35,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:35,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=45146.666666666664, ans=0.125 2023-09-28 14:23:37,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:23:37,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:23:41,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:41,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:44,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:44,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:23:44,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:45,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:23:48,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 14:23:52,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:55,744 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 14:23:57,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:23:59,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:24:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:01,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=45280.0, ans=0.0 2023-09-28 14:24:02,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 14:24:04,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:06,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 14:24:07,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 14:24:07,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=45280.0, ans=0.2 2023-09-28 14:24:09,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:11,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=45346.666666666664, ans=0.0010115942028985515 2023-09-28 14:24:13,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:13,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:24:14,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 14:24:17,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 14:24:17,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 14:24:19,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:20,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:24:25,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=45346.666666666664, ans=0.2 2023-09-28 14:24:30,452 INFO [train.py:1039] (1/4) Epoch 2, batch 1500, loss[loss=0.3307, simple_loss=0.3772, pruned_loss=0.1421, over 23273.00 frames. ], tot_loss[loss=0.3199, simple_loss=0.3575, pruned_loss=0.1412, over 4718560.84 frames. ], batch size: 93, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:24:30,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 14:24:30,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:24:30,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:24:33,280 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.694e+02 3.191e+02 3.911e+02 7.189e+02, threshold=6.382e+02, percent-clipped=1.0 2023-09-28 14:24:33,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:33,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:35,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:24:37,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 14:24:37,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=45413.333333333336, ans=0.000997101449275362 2023-09-28 14:24:38,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:24:38,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=45413.333333333336, ans=0.000997101449275362 2023-09-28 14:24:40,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:24:40,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:40,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:41,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:24:43,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:43,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=45413.333333333336, ans=0.0 2023-09-28 14:24:48,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=45480.0, ans=0.0 2023-09-28 14:24:49,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:49,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 14:24:50,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:24:51,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:24:51,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:51,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=45480.0, ans=0.1 2023-09-28 14:24:51,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=45480.0, ans=0.0 2023-09-28 14:24:53,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=45480.0, ans=0.125 2023-09-28 14:24:56,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 14:25:01,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.10 vs. limit=15.0 2023-09-28 14:25:02,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 14:25:04,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:25:06,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 14:25:08,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:25:08,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=45546.666666666664, ans=0.0 2023-09-28 14:25:11,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:11,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=45546.666666666664, ans=0.125 2023-09-28 14:25:13,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:25:13,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:25:15,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 14:25:16,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:25:16,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:17,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 14:25:17,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:21,165 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:25:22,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:25:22,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 14:25:30,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:25:30,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=45613.333333333336, ans=0.125 2023-09-28 14:25:31,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:25:32,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=45613.333333333336, ans=0.125 2023-09-28 14:25:34,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.44 vs. limit=15.0 2023-09-28 14:25:35,155 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 14:25:35,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:35,257 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 14:25:38,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:25:39,820 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.68 vs. limit=22.5 2023-09-28 14:25:40,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:25:42,001 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 14:25:42,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:25:45,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 14:25:46,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:50,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:51,722 INFO [train.py:1039] (1/4) Epoch 2, batch 1550, loss[loss=0.3295, simple_loss=0.3596, pruned_loss=0.1497, over 23772.00 frames. ], tot_loss[loss=0.323, simple_loss=0.3598, pruned_loss=0.1431, over 4714826.71 frames. ], batch size: 232, lr: 3.79e-02, grad_scale: 32.0 2023-09-28 14:25:51,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:51,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:52,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=45746.666666666664, ans=0.0009246376811594213 2023-09-28 14:25:53,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:53,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:53,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=45746.666666666664, ans=0.09899494936611666 2023-09-28 14:25:54,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 14:25:56,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 14:25:56,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:25:57,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 14:25:59,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 14:26:01,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:02,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:04,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:26:04,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:26:06,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:06,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:09,486 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 14:26:09,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:09,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=45813.333333333336, ans=0.125 2023-09-28 14:26:10,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:26:11,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:26:14,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:26:14,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 14:26:16,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:18,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 14:26:18,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 14:26:18,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 14:26:18,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:19,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:24,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:26:27,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 14:26:27,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 14:26:31,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=15.0 2023-09-28 14:26:35,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:37,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=45880.0, ans=0.2 2023-09-28 14:26:38,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:26:38,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:26:38,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:26:38,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 14:26:45,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:26:46,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:49,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:26:51,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:26:53,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:53,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 14:26:54,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:26:56,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:26:56,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:58,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:26:58,523 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 14:27:00,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:06,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 14:27:06,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=46013.333333333336, ans=0.0 2023-09-28 14:27:12,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:14,228 INFO [train.py:1039] (1/4) Epoch 2, batch 1600, loss[loss=0.3058, simple_loss=0.3541, pruned_loss=0.1288, over 24117.00 frames. ], tot_loss[loss=0.3237, simple_loss=0.3603, pruned_loss=0.1435, over 4708743.86 frames. ], batch size: 86, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:27:15,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:27:15,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 14:27:17,231 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.915e+02 2.937e+02 3.574e+02 4.472e+02 6.126e+02, threshold=7.147e+02, percent-clipped=0.0 2023-09-28 14:27:17,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:27:18,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:27:18,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:27:19,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:27:23,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:23,802 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.92 vs. limit=22.5 2023-09-28 14:27:24,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 14:27:25,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-09-28 14:27:25,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.82 vs. limit=15.0 2023-09-28 14:27:26,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 14:27:27,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 14:27:29,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:27:29,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 14:27:31,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:27:31,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=46146.666666666664, ans=0.05 2023-09-28 14:27:34,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:27:36,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=46146.666666666664, ans=0.000837681159420291 2023-09-28 14:27:38,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:27:42,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 14:27:45,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:27:45,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 14:27:47,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:49,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 14:27:55,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 14:28:02,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:03,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=46280.0, ans=0.125 2023-09-28 14:28:05,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 14:28:06,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=46280.0, ans=0.2 2023-09-28 14:28:07,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:07,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:07,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:28:10,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 14:28:15,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 14:28:15,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=46280.0, ans=0.125 2023-09-28 14:28:17,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:28:18,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:28:20,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:28:21,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:28:23,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:28:30,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:30,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=46346.666666666664, ans=0.1 2023-09-28 14:28:32,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:28:33,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 14:28:33,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:28:35,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 14:28:35,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=46413.333333333336, ans=0.0007797101449275347 2023-09-28 14:28:37,320 INFO [train.py:1039] (1/4) Epoch 2, batch 1650, loss[loss=0.3426, simple_loss=0.3572, pruned_loss=0.1641, over 23824.00 frames. ], tot_loss[loss=0.3241, simple_loss=0.3609, pruned_loss=0.1436, over 4707484.89 frames. ], batch size: 212, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:28:40,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:42,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:28:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:28:43,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 14:28:43,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 14:28:43,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 14:28:45,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 14:28:46,341 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=16.47 vs. limit=15.0 2023-09-28 14:28:47,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:47,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:49,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:28:49,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:28:52,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:54,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 14:28:57,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:57,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:57,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:28:57,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:28:57,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 14:28:57,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 14:29:04,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:29:08,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:29:13,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=46546.666666666664, ans=0.0 2023-09-28 14:29:16,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 14:29:17,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:21,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 14:29:24,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:26,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:29:26,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:29:27,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:29,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:29:29,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:29:32,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:33,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:33,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:36,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:29:38,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:38,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 14:29:39,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.99 vs. limit=12.0 2023-09-28 14:29:41,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:41,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 14:29:43,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 14:29:43,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 14:29:43,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:44,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:29:44,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:46,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:46,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 14:29:46,473 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:29:51,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:54,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:29:54,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:56,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 14:29:56,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=46680.0, ans=0.125 2023-09-28 14:30:00,461 INFO [train.py:1039] (1/4) Epoch 2, batch 1700, loss[loss=0.2644, simple_loss=0.3123, pruned_loss=0.1082, over 24293.00 frames. ], tot_loss[loss=0.3244, simple_loss=0.3607, pruned_loss=0.1441, over 4696422.04 frames. ], batch size: 56, lr: 3.77e-02, grad_scale: 16.0 2023-09-28 14:30:00,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:30:00,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:30:00,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 14:30:02,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:02,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:30:02,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:05,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.163e+02 2.907e+02 3.272e+02 3.830e+02 6.451e+02, threshold=6.545e+02, percent-clipped=0.0 2023-09-28 14:30:05,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:30:05,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:30:05,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 14:30:10,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:30:15,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=46813.333333333336, ans=0.2 2023-09-28 14:30:18,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:21,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:30:28,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:30:28,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:28,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:30,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:30:31,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 14:30:32,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=46880.0, ans=0.05 2023-09-28 14:30:35,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:30:35,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:35,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=46880.0, ans=0.1 2023-09-28 14:30:36,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:30:38,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:30:41,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 14:30:41,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 14:30:43,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:44,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 14:30:45,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:30:54,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:30:54,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:30:55,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:57,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:30:57,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 14:30:57,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:30:58,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=46946.666666666664, ans=0.125 2023-09-28 14:31:00,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:00,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 14:31:00,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:00,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:01,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:05,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:05,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:31:05,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:07,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:31:07,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:10,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:11,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 14:31:15,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:17,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:18,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 14:31:24,028 INFO [train.py:1039] (1/4) Epoch 2, batch 1750, loss[loss=0.283, simple_loss=0.3403, pruned_loss=0.1128, over 24447.00 frames. ], tot_loss[loss=0.3217, simple_loss=0.3585, pruned_loss=0.1424, over 4707041.99 frames. ], batch size: 63, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:31:25,047 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.94 vs. limit=15.0 2023-09-28 14:31:27,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:30,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:30,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:31:31,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 14:31:31,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:34,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=47080.0, ans=0.125 2023-09-28 14:31:35,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:31:35,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:40,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 14:31:43,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:44,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 14:31:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:47,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:31:49,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:31:51,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 14:31:53,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:53,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 14:32:01,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:32:05,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:05,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:08,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:08,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:11,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:32:12,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:14,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:14,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=47280.0, ans=0.2 2023-09-28 14:32:16,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:32:17,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 14:32:19,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.12 vs. limit=22.5 2023-09-28 14:32:20,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:21,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=47280.0, ans=0.0005913043478260865 2023-09-28 14:32:22,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 14:32:24,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:27,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:27,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:32:33,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:32:33,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:32:34,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:36,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:42,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:44,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:32:44,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:32:44,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 14:32:44,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:45,845 INFO [train.py:1039] (1/4) Epoch 2, batch 1800, loss[loss=0.3001, simple_loss=0.3338, pruned_loss=0.1332, over 24472.00 frames. ], tot_loss[loss=0.321, simple_loss=0.3578, pruned_loss=0.1421, over 4711964.78 frames. ], batch size: 58, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:32:47,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:32:47,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:32:47,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:32:47,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:32:49,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:32:50,422 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.289e+02 2.760e+02 3.253e+02 3.974e+02 7.457e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 14:32:52,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:32:52,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:53,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:32:57,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:59,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=47413.333333333336, ans=0.0 2023-09-28 14:33:00,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:33:02,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:33:05,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:09,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:09,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:11,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:33:12,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:33:12,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 14:33:13,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=47480.0, ans=0.125 2023-09-28 14:33:16,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:19,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:22,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 14:33:25,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 14:33:25,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 14:33:25,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:27,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:27,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:33:27,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:33:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 14:33:37,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:33:39,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:41,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 14:33:41,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 14:33:43,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:33:44,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:33:45,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=47613.333333333336, ans=0.125 2023-09-28 14:33:45,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=47613.333333333336, ans=0.0 2023-09-28 14:33:46,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:33:48,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=47613.333333333336, ans=0.035 2023-09-28 14:33:49,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 14:33:55,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=47680.0, ans=0.0005043478260869563 2023-09-28 14:33:56,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:33:58,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 14:33:59,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:33:59,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:59,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:34:01,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 14:34:04,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:34:04,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:07,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 14:34:07,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:09,035 INFO [train.py:1039] (1/4) Epoch 2, batch 1850, loss[loss=0.3059, simple_loss=0.3557, pruned_loss=0.1281, over 23933.00 frames. ], tot_loss[loss=0.3198, simple_loss=0.3577, pruned_loss=0.1409, over 4714666.81 frames. ], batch size: 86, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:34:10,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:10,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:34:10,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:11,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=47746.666666666664, ans=0.125 2023-09-28 14:34:12,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:12,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:34:16,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:34:16,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:19,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:34:21,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:34:29,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:34:29,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 14:34:32,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 14:34:34,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 14:34:35,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.11 vs. limit=22.5 2023-09-28 14:34:37,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:37,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 14:34:37,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:34:38,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=47813.333333333336, ans=0.00047536231884057895 2023-09-28 14:34:47,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:34:49,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 14:34:54,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:34:54,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:34:58,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 14:34:59,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:59,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:34:59,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:35:02,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:35:05,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:07,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:35:08,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:08,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:35:08,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:10,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:13,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:35:16,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 14:35:18,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:22,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:35:22,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:35:22,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 14:35:22,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 14:35:25,373 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 14:35:25,533 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 14:35:29,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:35:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:35:29,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:29,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:29,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=48013.333333333336, ans=10.0 2023-09-28 14:35:31,285 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 14:35:31,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:35:31,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:32,699 INFO [train.py:1039] (1/4) Epoch 2, batch 1900, loss[loss=0.339, simple_loss=0.3706, pruned_loss=0.1537, over 23793.00 frames. ], tot_loss[loss=0.3197, simple_loss=0.358, pruned_loss=0.1407, over 4719331.27 frames. ], batch size: 179, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:35:32,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:35:32,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:35:34,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:35:34,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 14:35:34,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=48080.0, ans=0.125 2023-09-28 14:35:37,580 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.980e+02 2.945e+02 3.315e+02 3.866e+02 6.379e+02, threshold=6.630e+02, percent-clipped=0.0 2023-09-28 14:35:37,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:37,767 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 14:35:37,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:35:39,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:43,538 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.91 vs. limit=6.0 2023-09-28 14:35:45,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:48,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:35:50,170 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 14:35:50,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 14:35:51,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:53,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:53,288 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 14:35:53,345 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 14:35:58,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 14:36:00,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:36:02,634 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:36:04,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 14:36:06,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 14:36:14,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=48213.333333333336, ans=0.125 2023-09-28 14:36:16,125 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:36:17,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 14:36:19,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.31 vs. limit=15.0 2023-09-28 14:36:20,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 14:36:20,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:36:20,792 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 14:36:20,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 14:36:22,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 14:36:22,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 14:36:22,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:36:22,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=48280.0, ans=0.0 2023-09-28 14:36:27,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 14:36:30,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:36:35,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:35,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 14:36:36,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=48280.0, ans=0.125 2023-09-28 14:36:37,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:36:42,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 14:36:42,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:47,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:36:47,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:36:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:36:49,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:36:51,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:36:51,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:36:52,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:36:55,505 INFO [train.py:1039] (1/4) Epoch 2, batch 1950, loss[loss=0.2968, simple_loss=0.3537, pruned_loss=0.12, over 24659.00 frames. ], tot_loss[loss=0.3211, simple_loss=0.3594, pruned_loss=0.1414, over 4722413.02 frames. ], batch size: 73, lr: 3.74e-02, grad_scale: 16.0 2023-09-28 14:36:55,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:36:55,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:36:57,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:36:57,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:58,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:58,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:37:03,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:04,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=48413.333333333336, ans=0.2 2023-09-28 14:37:04,444 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:37:05,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:37:07,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:07,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:37:09,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 14:37:09,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:37:10,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:11,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=48480.0, ans=0.1 2023-09-28 14:37:12,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:14,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:37:14,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:16,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:17,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:21,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:21,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:37:21,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:37:21,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:25,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:29,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:37:29,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:30,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:37:30,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 14:37:31,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:37:31,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:37:31,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:33,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.09 vs. limit=15.0 2023-09-28 14:37:34,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:37,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:37:43,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:37:44,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:37:46,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:37:46,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 14:37:46,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:37:52,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:53,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:37:55,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:37:56,029 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.17 vs. limit=22.5 2023-09-28 14:38:03,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:04,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:07,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:09,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:12,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:38:14,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:15,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 14:38:15,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:38:15,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:38:17,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 14:38:19,189 INFO [train.py:1039] (1/4) Epoch 2, batch 2000, loss[loss=0.3384, simple_loss=0.3765, pruned_loss=0.1502, over 23584.00 frames. ], tot_loss[loss=0.3208, simple_loss=0.3589, pruned_loss=0.1414, over 4716843.13 frames. ], batch size: 106, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:38:19,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:22,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:38:23,955 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.784e+02 3.169e+02 3.809e+02 6.996e+02, threshold=6.339e+02, percent-clipped=1.0 2023-09-28 14:38:24,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:38:24,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:38:26,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:38:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:32,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 14:38:33,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:38:35,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:38:37,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 14:38:38,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:38:38,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:40,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:38:43,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 14:38:45,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:50,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 14:38:50,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:38:53,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 14:38:53,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:38:57,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:38:58,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:38:58,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:58,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:00,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:02,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 14:39:04,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 14:39:06,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:39:06,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:10,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:12,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:39:12,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:12,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:39:13,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:15,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:15,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:15,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:17,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:20,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:20,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 14:39:25,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:39:27,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:39:36,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:37,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:38,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=49013.333333333336, ans=0.1 2023-09-28 14:39:39,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:39:40,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:39:42,371 INFO [train.py:1039] (1/4) Epoch 2, batch 2050, loss[loss=0.3227, simple_loss=0.3727, pruned_loss=0.1363, over 24556.00 frames. ], tot_loss[loss=0.3193, simple_loss=0.3578, pruned_loss=0.1404, over 4717348.60 frames. ], batch size: 71, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:39:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:44,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:47,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:48,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:52,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:55,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:39:56,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:56,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:39:59,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 14:40:00,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:40:00,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:40:00,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:40:09,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=49146.666666666664, ans=0.2 2023-09-28 14:40:12,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:12,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:12,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=49146.666666666664, ans=0.125 2023-09-28 14:40:14,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 14:40:15,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:17,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 14:40:17,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:20,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:23,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:25,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:40:25,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:26,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:40:29,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:40:29,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:40:32,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=49280.0, ans=0.0 2023-09-28 14:40:33,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:35,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:40:36,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.99 vs. limit=15.0 2023-09-28 14:40:38,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:40:39,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:40:45,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:40:49,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=49346.666666666664, ans=0.0 2023-09-28 14:40:50,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:40:50,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 14:40:56,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:40:57,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:41:00,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:41:01,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 14:41:04,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=49413.333333333336, ans=0.125 2023-09-28 14:41:04,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=49413.333333333336, ans=0.125 2023-09-28 14:41:05,371 INFO [train.py:1039] (1/4) Epoch 2, batch 2100, loss[loss=0.3254, simple_loss=0.3438, pruned_loss=0.1535, over 23381.00 frames. ], tot_loss[loss=0.3193, simple_loss=0.3566, pruned_loss=0.141, over 4699828.45 frames. ], batch size: 285, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:41:06,965 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 14:41:06,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:07,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:08,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:09,922 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.967e+02 2.956e+02 3.430e+02 4.185e+02 6.974e+02, threshold=6.859e+02, percent-clipped=1.0 2023-09-28 14:41:10,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:41:10,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 14:41:10,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 14:41:12,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:41:16,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:41:17,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:41:20,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:20,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:41:20,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 14:41:22,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:41:23,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 14:41:23,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 14:41:25,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:26,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:41:26,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 14:41:26,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 14:41:32,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 14:41:32,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:34,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:41:36,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:40,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:41:40,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 14:41:40,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:40,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:41:43,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 14:41:44,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:44,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 14:41:44,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 14:41:44,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 14:41:46,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:41:49,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:41:52,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:55,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:56,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:56,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:56,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 14:41:58,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:58,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:58,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:58,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 14:41:59,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 14:42:01,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 14:42:05,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:42:10,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:42:10,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 14:42:17,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:18,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:42:18,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:18,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:42:20,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:42:20,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:42:22,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:22,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:42:24,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:42:24,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:26,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 14:42:26,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=49746.666666666664, ans=5.507246376811742e-05 2023-09-28 14:42:27,847 INFO [train.py:1039] (1/4) Epoch 2, batch 2150, loss[loss=0.3222, simple_loss=0.3501, pruned_loss=0.1472, over 23835.00 frames. ], tot_loss[loss=0.3163, simple_loss=0.3542, pruned_loss=0.1392, over 4710066.64 frames. ], batch size: 164, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:42:27,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 14:42:28,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:30,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:42:30,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:42:30,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:42:31,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:42:37,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:42:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:38,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:40,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:42:40,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:40,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:42:45,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:45,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:42:45,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:42:50,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:50,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 14:42:54,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:42:54,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=49813.333333333336, ans=0.0 2023-09-28 14:42:57,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:42:58,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:58,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:42:58,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:00,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:43:00,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:43:00,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 14:43:01,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:43:03,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:03,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:05,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:43:06,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:43:09,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:09,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:43:11,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:11,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 14:43:11,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:43:11,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=49880.0, ans=0.0 2023-09-28 14:43:14,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:15,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:18,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:19,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:43:19,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:21,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:21,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 14:43:22,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 14:43:24,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:43:24,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 14:43:25,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:27,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:43:27,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 14:43:27,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:43:27,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 14:43:29,699 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 14:43:29,699 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 14:43:30,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=49946.666666666664, ans=0.1 2023-09-28 14:43:31,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 14:43:31,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:32,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:32,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:43:34,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:34,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:43:37,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:37,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:45,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:43:46,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 14:43:47,917 INFO [train.py:1039] (1/4) Epoch 2, batch 2200, loss[loss=0.3403, simple_loss=0.3637, pruned_loss=0.1584, over 23819.00 frames. ], tot_loss[loss=0.3174, simple_loss=0.355, pruned_loss=0.1399, over 4714049.79 frames. ], batch size: 212, lr: 3.71e-02, grad_scale: 32.0 2023-09-28 14:43:51,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:43:52,562 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.160e+02 2.946e+02 3.281e+02 3.928e+02 6.005e+02, threshold=6.562e+02, percent-clipped=0.0 2023-09-28 14:43:56,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:56,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:43:56,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:59,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:44:02,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:02,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:44:03,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 14:44:08,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 14:44:09,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:44:10,982 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.50 vs. limit=8.0 2023-09-28 14:44:16,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 14:44:19,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:19,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:19,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:44:19,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=50213.333333333336, ans=0.0 2023-09-28 14:44:22,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:44:22,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 14:44:27,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:44:28,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:29,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 14:44:34,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:44:34,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:38,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:44:40,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:41,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 14:44:42,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=50280.0, ans=0.1 2023-09-28 14:44:43,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:44,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 14:44:46,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:46,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:44:46,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:48,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:49,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:49,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:49,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:51,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:44:51,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:44:52,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:44:56,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:44:56,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:59,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:45:01,407 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 14:45:03,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:45:04,624 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 14:45:04,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:45:06,253 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 14:45:06,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=50346.666666666664, ans=0.0 2023-09-28 14:45:08,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:08,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:45:10,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:11,953 INFO [train.py:1039] (1/4) Epoch 2, batch 2250, loss[loss=0.3055, simple_loss=0.3516, pruned_loss=0.1296, over 24352.00 frames. ], tot_loss[loss=0.3194, simple_loss=0.3569, pruned_loss=0.141, over 4705621.66 frames. ], batch size: 61, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:45:12,200 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 14:45:13,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:45:15,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:20,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=50413.333333333336, ans=0.0 2023-09-28 14:45:22,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:45:23,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:45:24,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=50413.333333333336, ans=0.125 2023-09-28 14:45:26,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:27,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:29,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:32,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 14:45:32,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:45:32,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:45:35,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 14:45:35,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:45:37,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:38,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:44,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:45:45,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:45:45,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:45:47,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 14:45:48,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=50546.666666666664, ans=0.0 2023-09-28 14:45:49,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:50,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:45:57,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:45:58,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:46:00,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:00,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:46:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:46:02,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=50613.333333333336, ans=0.2 2023-09-28 14:46:03,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:46:03,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=50613.333333333336, ans=0.2 2023-09-28 14:46:10,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:46:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:46:18,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:46:18,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:46:20,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:46:24,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:46:27,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=50680.0, ans=0.1 2023-09-28 14:46:29,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:46:29,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 14:46:30,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:30,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:46:32,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 14:46:33,545 INFO [train.py:1039] (1/4) Epoch 2, batch 2300, loss[loss=0.3496, simple_loss=0.3958, pruned_loss=0.1517, over 24450.00 frames. ], tot_loss[loss=0.3189, simple_loss=0.3574, pruned_loss=0.1402, over 4720526.83 frames. ], batch size: 77, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:46:35,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:46:35,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:35,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=50746.666666666664, ans=0.0 2023-09-28 14:46:38,471 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.122e+02 3.000e+02 3.557e+02 4.160e+02 8.082e+02, threshold=7.115e+02, percent-clipped=2.0 2023-09-28 14:46:41,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:41,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:46:45,333 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 14:46:46,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:47,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.73 vs. limit=15.0 2023-09-28 14:46:53,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:46:53,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:46:53,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=50813.333333333336, ans=0.125 2023-09-28 14:46:54,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:46:54,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:54,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 14:46:57,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:47:00,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:00,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:47:03,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:47:06,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:47:10,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:16,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:47:17,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:47:20,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:47:22,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:47:27,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:27,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:47:27,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:47:27,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 14:47:33,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:47:33,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:33,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:33,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:47:33,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:34,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 14:47:34,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:47:36,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 14:47:36,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:47:36,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:37,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 14:47:44,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:47:47,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:47:52,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:52,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:47:52,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:47:52,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:47:54,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:47:55,607 INFO [train.py:1039] (1/4) Epoch 2, batch 2350, loss[loss=0.3086, simple_loss=0.3584, pruned_loss=0.1294, over 23994.00 frames. ], tot_loss[loss=0.3194, simple_loss=0.3576, pruned_loss=0.1406, over 4717097.32 frames. ], batch size: 80, lr: 3.69e-02, grad_scale: 32.0 2023-09-28 14:47:55,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:47:55,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 14:48:01,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:01,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 14:48:08,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 14:48:11,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:48:14,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:14,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=51146.666666666664, ans=0.04949747468305833 2023-09-28 14:48:15,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:16,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 14:48:19,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:48:24,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 14:48:27,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:28,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.12 vs. limit=15.0 2023-09-28 14:48:29,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:48:29,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:48:32,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:48:32,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 14:48:34,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:48:36,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:36,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:48:37,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:48:42,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:48:44,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 14:48:44,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:47,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:47,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:48:49,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 14:48:50,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:48:55,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 14:48:55,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:49:00,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 14:49:05,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 14:49:05,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:49:05,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:49:07,353 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 14:49:07,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 14:49:08,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 14:49:14,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:49:18,675 INFO [train.py:1039] (1/4) Epoch 2, batch 2400, loss[loss=0.3014, simple_loss=0.3488, pruned_loss=0.127, over 24649.00 frames. ], tot_loss[loss=0.3193, simple_loss=0.3575, pruned_loss=0.1405, over 4712336.33 frames. ], batch size: 65, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:49:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:49:23,883 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.787e+02 3.340e+02 4.281e+02 7.222e+02, threshold=6.680e+02, percent-clipped=0.0 2023-09-28 14:49:24,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:49:24,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:49:25,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 14:49:25,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 14:49:34,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:49:34,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:49:36,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 14:49:36,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:49:37,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:37,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 14:49:44,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:46,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 14:49:51,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:49:54,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=51546.666666666664, ans=0.0 2023-09-28 14:49:57,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 14:50:00,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:03,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:06,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:07,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 14:50:08,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:50:09,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=51613.333333333336, ans=0.125 2023-09-28 14:50:09,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=51613.333333333336, ans=0.1 2023-09-28 14:50:09,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=51613.333333333336, ans=0.0 2023-09-28 14:50:11,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=51613.333333333336, ans=0.09899494936611666 2023-09-28 14:50:17,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:18,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:22,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:24,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:50:24,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:50:24,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:50:24,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:24,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:24,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:50:27,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=51680.0, ans=0.07 2023-09-28 14:50:30,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:50:30,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:50:32,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 14:50:32,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 14:50:35,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:35,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:35,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 14:50:35,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 14:50:37,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 14:50:37,048 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 14:50:37,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 14:50:37,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:38,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:38,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:40,504 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 14:50:41,853 INFO [train.py:1039] (1/4) Epoch 2, batch 2450, loss[loss=0.3235, simple_loss=0.3648, pruned_loss=0.1411, over 23766.00 frames. ], tot_loss[loss=0.3164, simple_loss=0.3549, pruned_loss=0.139, over 4704311.75 frames. ], batch size: 85, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:50:42,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:44,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:50:47,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:50:47,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:51,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=51746.666666666664, ans=0.07 2023-09-28 14:50:52,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:52,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:53,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 14:50:58,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=51813.333333333336, ans=0.125 2023-09-28 14:50:59,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:59,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:02,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:51:02,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:51:02,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:51:02,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 14:51:04,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=51813.333333333336, ans=0.125 2023-09-28 14:51:07,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:09,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:51:09,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:51:10,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:51:12,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:51:17,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 14:51:17,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:51:19,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=17.60 vs. limit=15.0 2023-09-28 14:51:29,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:30,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:31,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:32,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:51:32,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:34,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:51:35,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 14:51:37,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:37,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:51:42,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:51:42,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:49,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:51:49,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 14:51:49,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:51:51,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:51:51,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 14:51:51,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:51:52,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:51:57,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:52:01,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:01,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:52:04,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 14:52:05,483 INFO [train.py:1039] (1/4) Epoch 2, batch 2500, loss[loss=0.302, simple_loss=0.3626, pruned_loss=0.1207, over 24658.00 frames. ], tot_loss[loss=0.3148, simple_loss=0.353, pruned_loss=0.1383, over 4692128.23 frames. ], batch size: 73, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:52:05,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:52:10,765 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.999e+02 2.754e+02 3.242e+02 3.766e+02 6.714e+02, threshold=6.484e+02, percent-clipped=2.0 2023-09-28 14:52:12,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:22,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:52:22,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:52:25,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 14:52:32,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:52:33,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:52:35,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:52:35,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:52:37,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 14:52:38,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:38,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:40,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 14:52:40,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:41,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 14:52:41,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:52:45,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:45,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=52213.333333333336, ans=0.0 2023-09-28 14:52:47,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:50,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:52:50,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 14:52:50,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:52:52,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:57,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:52:59,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.17 vs. limit=15.0 2023-09-28 14:53:01,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:05,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:10,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:53:12,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 14:53:12,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:12,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:16,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:53:16,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:53:17,919 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 14:53:17,919 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 14:53:17,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 14:53:19,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:22,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 14:53:22,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 14:53:22,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:24,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 14:53:29,224 INFO [train.py:1039] (1/4) Epoch 2, batch 2550, loss[loss=0.3174, simple_loss=0.3737, pruned_loss=0.1305, over 24462.00 frames. ], tot_loss[loss=0.3137, simple_loss=0.3524, pruned_loss=0.1375, over 4697337.45 frames. ], batch size: 69, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:53:29,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 14:53:32,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:34,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:53:36,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:53:36,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:37,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 14:53:37,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:53:41,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 14:53:42,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:53:46,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:47,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=15.0 2023-09-28 14:53:49,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:49,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 14:53:49,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:53:49,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:53:49,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:53,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:53:53,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 14:53:53,400 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:53:54,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:54,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:54,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 14:54:01,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=52546.666666666664, ans=0.125 2023-09-28 14:54:03,756 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.17 vs. limit=15.0 2023-09-28 14:54:06,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:54:10,775 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.61 vs. limit=22.5 2023-09-28 14:54:11,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:11,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:11,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:54:13,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:54:21,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:54:24,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:54:24,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:54:24,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:54:24,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=52613.333333333336, ans=0.0 2023-09-28 14:54:25,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:54:25,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:54:31,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:31,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:37,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:54:39,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 14:54:39,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:54:39,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:41,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:54:43,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:54:44,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:52,139 INFO [train.py:1039] (1/4) Epoch 2, batch 2600, loss[loss=0.2895, simple_loss=0.3417, pruned_loss=0.1187, over 24435.00 frames. ], tot_loss[loss=0.3157, simple_loss=0.354, pruned_loss=0.1387, over 4699587.04 frames. ], batch size: 63, lr: 3.66e-02, grad_scale: 32.0 2023-09-28 14:54:52,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:54:53,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:54,232 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:54:57,273 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.901e+02 3.329e+02 4.085e+02 7.147e+02, threshold=6.657e+02, percent-clipped=2.0 2023-09-28 14:54:57,481 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 14:54:59,271 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 14:54:59,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:54:59,344 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 14:55:00,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 14:55:00,847 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 14:55:02,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:55:02,616 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 14:55:06,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 14:55:07,643 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 14:55:09,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=52813.333333333336, ans=0.0 2023-09-28 14:55:11,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:55:12,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 14:55:12,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=52813.333333333336, ans=0.0 2023-09-28 14:55:14,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 14:55:16,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:55:16,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 14:55:19,589 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 14:55:19,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 14:55:20,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.87 vs. limit=22.5 2023-09-28 14:55:25,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:27,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:27,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:27,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 14:55:29,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=52880.0, ans=0.125 2023-09-28 14:55:29,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=52880.0, ans=0.125 2023-09-28 14:55:30,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:55:34,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=52880.0, ans=0.2 2023-09-28 14:55:35,698 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 14:55:42,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:42,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:42,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 14:55:42,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:55:42,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:44,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 14:55:47,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:55:47,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:55:49,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=52946.666666666664, ans=0.125 2023-09-28 14:55:50,043 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.33 vs. limit=15.0 2023-09-28 14:55:50,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:51,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=52946.666666666664, ans=0.1 2023-09-28 14:55:53,056 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 14:55:54,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:54,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:56:02,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:56:02,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:56:04,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 14:56:04,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:56:04,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=53013.333333333336, ans=0.125 2023-09-28 14:56:04,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=53013.333333333336, ans=0.125 2023-09-28 14:56:07,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:07,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:12,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=53013.333333333336, ans=0.1 2023-09-28 14:56:13,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 14:56:13,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:15,303 INFO [train.py:1039] (1/4) Epoch 2, batch 2650, loss[loss=0.319, simple_loss=0.3433, pruned_loss=0.1474, over 23871.00 frames. ], tot_loss[loss=0.3159, simple_loss=0.3544, pruned_loss=0.1387, over 4706018.76 frames. ], batch size: 195, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:56:15,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:56:19,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 14:56:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:20,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:56:22,226 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 14:56:22,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:56:25,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:27,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:56:29,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:30,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:56:32,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 14:56:32,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:56:32,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:56:32,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=53146.666666666664, ans=0.1 2023-09-28 14:56:35,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.46 vs. limit=15.0 2023-09-28 14:56:35,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 14:56:35,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=53146.666666666664, ans=0.0 2023-09-28 14:56:39,011 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 14:56:42,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:43,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=53146.666666666664, ans=0.0 2023-09-28 14:56:45,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 14:56:45,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:56:47,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 14:56:50,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:50,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:56:50,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:52,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:56:57,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 14:56:57,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 14:56:59,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:02,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=53213.333333333336, ans=0.1 2023-09-28 14:57:03,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 14:57:04,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:04,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=53280.0, ans=0.125 2023-09-28 14:57:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:06,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:06,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:06,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:08,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:11,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:12,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:57:13,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:57:13,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:57:15,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:16,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:57:16,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:22,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:23,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:57:27,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:28,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:57:28,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:30,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 14:57:30,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=53346.666666666664, ans=0.95 2023-09-28 14:57:35,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:36,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:38,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:39,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:40,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:40,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:41,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=53413.333333333336, ans=0.09899494936611666 2023-09-28 14:57:41,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.33 vs. limit=15.0 2023-09-28 14:57:42,112 INFO [train.py:1039] (1/4) Epoch 2, batch 2700, loss[loss=0.263, simple_loss=0.3067, pruned_loss=0.1097, over 24342.00 frames. ], tot_loss[loss=0.3178, simple_loss=0.3561, pruned_loss=0.1398, over 4702709.61 frames. ], batch size: 56, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:57:42,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:57:42,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 14:57:45,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:57:46,780 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.066e+02 2.772e+02 3.228e+02 4.080e+02 7.773e+02, threshold=6.457e+02, percent-clipped=3.0 2023-09-28 14:57:47,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 14:57:48,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:50,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:57:52,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:52,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:57:52,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:57:52,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 14:57:53,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:57:53,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:55,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:57:55,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:57,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=53480.0, ans=0.1 2023-09-28 14:57:59,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:58:00,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 14:58:00,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:09,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:58:09,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:14,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:58:14,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:58:14,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:58:14,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:58:15,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=53546.666666666664, ans=0.2 2023-09-28 14:58:19,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:22,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:22,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:58:22,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:58:29,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:29,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:58:38,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:58:38,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:58:41,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:58:42,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:58:47,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:49,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:50,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:50,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:58:52,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:52,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:58:56,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:58,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:58,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:59:00,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 14:59:02,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:03,496 INFO [train.py:1039] (1/4) Epoch 2, batch 2750, loss[loss=0.3278, simple_loss=0.3862, pruned_loss=0.1347, over 24334.00 frames. ], tot_loss[loss=0.3182, simple_loss=0.3561, pruned_loss=0.1401, over 4696809.67 frames. ], batch size: 74, lr: 3.64e-02, grad_scale: 16.0 2023-09-28 14:59:05,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:59:05,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 14:59:07,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 14:59:07,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:08,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:08,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:10,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=53746.666666666664, ans=0.125 2023-09-28 14:59:11,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:12,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:59:12,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:18,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:59:18,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:59:18,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:18,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 14:59:18,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:59:18,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:25,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 14:59:27,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:59:27,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:28,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:59:29,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:59:30,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:31,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:59:31,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:33,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:38,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:59:38,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:59:38,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:59:40,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:42,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:59:48,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:50,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:59:50,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:55,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:55,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:59:57,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:00:02,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:00:02,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:00:02,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 15:00:08,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:10,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 15:00:16,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:00:18,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:00:18,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 15:00:20,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:00:23,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:00:23,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 15:00:24,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:00:26,417 INFO [train.py:1039] (1/4) Epoch 2, batch 2800, loss[loss=0.3054, simple_loss=0.3028, pruned_loss=0.154, over 19237.00 frames. ], tot_loss[loss=0.3163, simple_loss=0.3541, pruned_loss=0.1393, over 4692205.28 frames. ], batch size: 389, lr: 3.64e-02, grad_scale: 32.0 2023-09-28 15:00:28,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:00:28,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:28,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:00:29,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 15:00:29,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:29,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:31,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:33,353 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.006e+02 2.948e+02 3.600e+02 4.282e+02 6.554e+02, threshold=7.201e+02, percent-clipped=1.0 2023-09-28 15:00:33,482 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 15:00:33,483 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 15:00:36,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:37,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=54080.0, ans=0.2 2023-09-28 15:00:38,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:00:38,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:00:41,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:00:43,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 15:00:47,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:00:48,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 15:00:50,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:50,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:00:50,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:00:53,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:00:53,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:53,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:00:53,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=54146.666666666664, ans=0.0 2023-09-28 15:00:56,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:03,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:01:05,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:10,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:01:12,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:12,693 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=15.0 2023-09-28 15:01:15,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:15,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 15:01:16,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:17,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=54280.0, ans=0.125 2023-09-28 15:01:18,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:18,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:01:22,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:26,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:27,413 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.18 vs. limit=22.5 2023-09-28 15:01:28,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:01:28,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:01:29,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:01:29,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:01:30,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:01:30,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 15:01:32,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:32,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.70 vs. limit=22.5 2023-09-28 15:01:34,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:34,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:35,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 15:01:35,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=54346.666666666664, ans=0.125 2023-09-28 15:01:36,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:36,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:01:38,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:01:39,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 15:01:46,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:46,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:01:46,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:01:49,672 INFO [train.py:1039] (1/4) Epoch 2, batch 2850, loss[loss=0.318, simple_loss=0.3483, pruned_loss=0.1438, over 23826.00 frames. ], tot_loss[loss=0.3147, simple_loss=0.3527, pruned_loss=0.1384, over 4689716.19 frames. ], batch size: 195, lr: 3.63e-02, grad_scale: 32.0 2023-09-28 15:01:49,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:01:56,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:01:56,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:56,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:59,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=54413.333333333336, ans=0.125 2023-09-28 15:02:01,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:01,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:02:01,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=54413.333333333336, ans=0.125 2023-09-28 15:02:02,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:02:02,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 15:02:03,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=54413.333333333336, ans=0.0 2023-09-28 15:02:08,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=54480.0, ans=0.0 2023-09-28 15:02:09,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 15:02:09,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:11,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=54480.0, ans=0.1 2023-09-28 15:02:12,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 15:02:13,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:15,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 15:02:17,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 15:02:17,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:21,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=54546.666666666664, ans=0.125 2023-09-28 15:02:23,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=54546.666666666664, ans=0.0 2023-09-28 15:02:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:32,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:32,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:02:34,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:02:34,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:02:34,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:02:36,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:02:36,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 15:02:36,537 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:02:39,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:02:39,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:02:39,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:41,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:44,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:44,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:45,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:48,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:49,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=54613.333333333336, ans=0.04949747468305833 2023-09-28 15:02:50,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:02:52,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:53,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:56,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:02:56,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.09 vs. limit=22.5 2023-09-28 15:02:58,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.55 vs. limit=15.0 2023-09-28 15:03:02,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:03:04,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 15:03:04,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 15:03:06,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:03:08,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:08,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 15:03:09,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:03:09,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:10,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:10,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:03:10,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 15:03:10,969 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 15:03:10,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:12,360 INFO [train.py:1039] (1/4) Epoch 2, batch 2900, loss[loss=0.2753, simple_loss=0.3299, pruned_loss=0.1103, over 24296.00 frames. ], tot_loss[loss=0.3142, simple_loss=0.3527, pruned_loss=0.1379, over 4705382.89 frames. ], batch size: 61, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:03:12,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:14,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=54746.666666666664, ans=0.0 2023-09-28 15:03:15,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:15,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:18,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:03:19,518 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.086e+02 2.913e+02 3.691e+02 4.538e+02 7.186e+02, threshold=7.382e+02, percent-clipped=0.0 2023-09-28 15:03:19,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 15:03:22,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:22,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 15:03:24,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 15:03:26,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:03:26,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:03:29,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:30,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:03:34,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:34,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:39,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:03:39,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 15:03:39,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:03:42,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:44,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 15:03:44,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 15:03:47,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:47,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 15:03:47,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:03:50,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:03:50,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:55,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:57,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:04:00,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:04:01,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:05,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 15:04:06,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 15:04:06,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:04:09,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:04:12,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 15:04:14,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:04:20,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:04:30,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:04:30,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:04:31,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 15:04:34,434 INFO [train.py:1039] (1/4) Epoch 2, batch 2950, loss[loss=0.2977, simple_loss=0.338, pruned_loss=0.1287, over 23569.00 frames. ], tot_loss[loss=0.3131, simple_loss=0.3525, pruned_loss=0.1368, over 4708290.86 frames. ], batch size: 134, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:04:34,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:34,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 15:04:34,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:36,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:04:41,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:42,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 15:04:44,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:44,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:46,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:04:46,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:04:48,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 15:04:49,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 15:04:50,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:04:50,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:52,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.18 vs. limit=22.5 2023-09-28 15:04:58,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:04:59,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:01,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:03,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:06,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:07,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:05:08,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:05:11,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 15:05:16,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 15:05:16,882 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 15:05:18,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:05:20,715 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 15:05:22,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 15:05:23,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:24,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:05:24,515 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 15:05:24,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:05:27,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 15:05:27,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:27,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:05:30,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:31,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=55280.0, ans=0.1 2023-09-28 15:05:32,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:05:33,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:33,862 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 15:05:33,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:35,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 15:05:37,823 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:05:39,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=55280.0, ans=0.1 2023-09-28 15:05:40,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:41,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:05:42,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 15:05:42,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:05:45,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 15:05:46,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:48,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:50,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:05:50,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:52,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:05:54,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:05:54,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:54,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:05:54,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:05:56,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:56,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:05:57,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:58,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 15:05:59,272 INFO [train.py:1039] (1/4) Epoch 2, batch 3000, loss[loss=0.3118, simple_loss=0.3595, pruned_loss=0.1321, over 24567.00 frames. ], tot_loss[loss=0.3135, simple_loss=0.353, pruned_loss=0.137, over 4717932.97 frames. ], batch size: 71, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:05:59,273 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 15:06:14,701 INFO [train.py:1071] (1/4) Epoch 2, validation: loss=0.3279, simple_loss=0.3383, pruned_loss=0.1588, over 1125622.00 frames. 2023-09-28 15:06:14,702 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 15:06:14,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:06:17,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:06:17,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:06:20,430 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=26.65 vs. limit=22.5 2023-09-28 15:06:20,862 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.946e+02 3.548e+02 4.220e+02 7.965e+02, threshold=7.096e+02, percent-clipped=1.0 2023-09-28 15:06:21,015 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 15:06:21,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 15:06:23,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:06:23,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:06:25,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 15:06:25,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:33,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:06:43,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:06:44,538 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.53 vs. limit=15.0 2023-09-28 15:06:49,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=55546.666666666664, ans=0.125 2023-09-28 15:06:50,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 15:06:52,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:06:55,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:06:55,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:55,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:06:58,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:06:58,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 15:07:01,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 15:07:02,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:07:02,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:07:04,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:07:04,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:06,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:06,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:07:09,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:07:10,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:07:10,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:07:13,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:16,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 15:07:16,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:07:16,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:17,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:07:23,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:23,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:24,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:07:24,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 15:07:24,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:07:25,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 15:07:26,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:07:28,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 15:07:30,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=55680.0, ans=0.125 2023-09-28 15:07:31,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:07:34,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:07:34,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 15:07:34,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 15:07:34,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:07:36,379 INFO [train.py:1039] (1/4) Epoch 2, batch 3050, loss[loss=0.32, simple_loss=0.3511, pruned_loss=0.1445, over 23758.00 frames. ], tot_loss[loss=0.3138, simple_loss=0.3534, pruned_loss=0.137, over 4722387.27 frames. ], batch size: 212, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:07:37,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:07:39,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:39,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:07:39,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:39,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:07:41,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 15:07:41,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=55746.666666666664, ans=0.125 2023-09-28 15:07:42,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:07:45,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:07:46,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:07:51,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:54,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 15:07:57,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=55813.333333333336, ans=0.125 2023-09-28 15:08:00,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 15:08:02,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 15:08:02,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:04,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:08:07,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:07,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.41 vs. limit=15.0 2023-09-28 15:08:08,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:09,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:12,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=55880.0, ans=0.1 2023-09-28 15:08:13,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:13,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:08:15,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:15,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:15,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:16,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:17,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=55880.0, ans=0.07 2023-09-28 15:08:18,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:19,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.24 vs. limit=15.0 2023-09-28 15:08:21,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:21,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 15:08:21,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:21,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:08:24,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:08:26,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:08:27,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:08:27,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:32,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:33,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:35,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=55946.666666666664, ans=0.05 2023-09-28 15:08:37,853 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.04 vs. limit=15.0 2023-09-28 15:08:40,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:40,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:08:40,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:43,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:44,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:08:44,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:46,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 15:08:50,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:50,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:51,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 15:08:53,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:56,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=56080.0, ans=0.09899494936611666 2023-09-28 15:08:57,750 INFO [train.py:1039] (1/4) Epoch 2, batch 3100, loss[loss=0.3154, simple_loss=0.3604, pruned_loss=0.1352, over 23408.00 frames. ], tot_loss[loss=0.312, simple_loss=0.353, pruned_loss=0.1355, over 4739110.23 frames. ], batch size: 93, lr: 3.60e-02, grad_scale: 32.0 2023-09-28 15:08:59,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:00,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:09:01,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.31 vs. limit=15.0 2023-09-28 15:09:03,856 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.748e+02 3.065e+02 3.838e+02 6.915e+02, threshold=6.130e+02, percent-clipped=0.0 2023-09-28 15:09:03,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:09:06,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 15:09:09,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 15:09:10,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=56080.0, ans=0.125 2023-09-28 15:09:11,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 15:09:11,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:09:14,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:09:14,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:16,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:09:21,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:27,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 15:09:30,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:09:30,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:32,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:09:32,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:09:33,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:09:35,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:09:35,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 15:09:35,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:09:38,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:38,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 15:09:40,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:09:42,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:09:44,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 15:09:44,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 15:09:46,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:47,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:49,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:09:49,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:49,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:09:52,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:09:52,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:56,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:09:56,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:09:56,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:56,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:10:01,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.33 vs. limit=10.0 2023-09-28 15:10:02,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:10:04,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 15:10:05,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:10:07,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 15:10:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:07,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:07,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 15:10:08,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.88 vs. limit=22.5 2023-09-28 15:10:16,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 15:10:19,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:19,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:20,595 INFO [train.py:1039] (1/4) Epoch 2, batch 3150, loss[loss=0.3308, simple_loss=0.3585, pruned_loss=0.1515, over 23736.00 frames. ], tot_loss[loss=0.311, simple_loss=0.3515, pruned_loss=0.1353, over 4717006.45 frames. ], batch size: 179, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:10:22,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:10:22,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:10:25,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 15:10:25,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:27,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:10:29,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 15:10:29,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:30,769 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 15:10:33,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=56413.333333333336, ans=10.0 2023-09-28 15:10:35,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 15:10:35,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:10:36,721 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 15:10:36,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=56480.0, ans=0.0 2023-09-28 15:10:38,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:10:39,040 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.63 vs. limit=22.5 2023-09-28 15:10:39,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 15:10:41,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 15:10:41,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 15:10:41,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:41,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:10:42,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:44,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 15:10:47,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:47,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:48,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:50,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:10:53,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 15:10:55,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:10:58,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:10:59,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:59,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 15:11:03,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 15:11:05,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:11:05,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:11:05,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:11:05,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:05,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:11:08,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:11:08,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:11:08,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 15:11:10,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:11:10,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:11,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:11:11,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:11:13,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 15:11:13,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:16,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 15:11:16,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:16,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 15:11:18,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 15:11:18,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:11:20,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:22,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 15:11:22,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 15:11:24,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:27,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:11:28,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:28,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:11:33,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:11:36,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:40,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 15:11:43,301 INFO [train.py:1039] (1/4) Epoch 2, batch 3200, loss[loss=0.2824, simple_loss=0.333, pruned_loss=0.1159, over 20063.00 frames. ], tot_loss[loss=0.3103, simple_loss=0.3505, pruned_loss=0.1351, over 4700779.52 frames. ], batch size: 43, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:11:45,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:11:45,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 15:11:49,720 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 2.897e+02 3.504e+02 4.245e+02 7.793e+02, threshold=7.007e+02, percent-clipped=2.0 2023-09-28 15:11:49,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:50,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=56746.666666666664, ans=0.2 2023-09-28 15:11:51,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:11:51,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 15:11:53,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=56746.666666666664, ans=0.125 2023-09-28 15:11:54,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:57,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.72 vs. limit=12.0 2023-09-28 15:12:01,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:12:04,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:12:13,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:12:24,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 15:12:25,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:12:28,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 15:12:29,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:12:34,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:12:34,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:12:36,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:12:41,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 15:12:42,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:12:44,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 15:12:47,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 15:12:48,788 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.25 vs. limit=15.0 2023-09-28 15:12:49,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:12:54,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:56,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:12:56,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:57,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.90 vs. limit=6.0 2023-09-28 15:12:57,728 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 15:12:57,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:13:00,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:01,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 15:13:03,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 15:13:03,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 15:13:03,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=57013.333333333336, ans=0.125 2023-09-28 15:13:04,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 15:13:06,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:13:07,654 INFO [train.py:1039] (1/4) Epoch 2, batch 3250, loss[loss=0.367, simple_loss=0.364, pruned_loss=0.1849, over 19279.00 frames. ], tot_loss[loss=0.3113, simple_loss=0.351, pruned_loss=0.1358, over 4699450.75 frames. ], batch size: 388, lr: 3.58e-02, grad_scale: 32.0 2023-09-28 15:13:10,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:13:10,141 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 15:13:10,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:10,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:11,629 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 15:13:14,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:13:18,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:26,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:13:26,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 15:13:28,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:28,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:13:28,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:29,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:29,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:13:34,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:34,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:13:34,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:36,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:13:40,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:42,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:42,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:44,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:45,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=57213.333333333336, ans=10.0 2023-09-28 15:13:46,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:46,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:13:49,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 15:13:50,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:50,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:13:53,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:53,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:13:59,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:13:59,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=57280.0, ans=0.1 2023-09-28 15:14:09,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:11,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:11,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 15:14:11,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:14:11,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:14:11,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:15,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 15:14:16,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 15:14:16,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:14:17,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:19,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:19,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:14:21,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:22,452 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.69 vs. limit=22.5 2023-09-28 15:14:24,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:24,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:27,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 15:14:27,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:31,064 INFO [train.py:1039] (1/4) Epoch 2, batch 3300, loss[loss=0.2886, simple_loss=0.3396, pruned_loss=0.1188, over 24320.00 frames. ], tot_loss[loss=0.3131, simple_loss=0.3523, pruned_loss=0.1369, over 4683774.38 frames. ], batch size: 61, lr: 3.58e-02, grad_scale: 16.0 2023-09-28 15:14:31,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:14:31,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 15:14:33,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=57413.333333333336, ans=0.1 2023-09-28 15:14:34,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:34,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 15:14:36,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 15:14:37,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 15:14:37,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:38,875 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.772e+02 3.522e+02 4.271e+02 9.362e+02, threshold=7.044e+02, percent-clipped=2.0 2023-09-28 15:14:41,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:42,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:14:44,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:46,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:14:46,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:14:47,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.44 vs. limit=15.0 2023-09-28 15:14:49,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:51,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:54,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 15:14:54,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:14:54,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:56,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:56,565 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 15:14:58,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:14:58,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:14:58,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=57480.0, ans=0.0 2023-09-28 15:14:59,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:14:59,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:14:59,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 15:15:02,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:02,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:15:05,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:05,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 15:15:06,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 15:15:06,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:08,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:15:09,889 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 15:15:12,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 15:15:15,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:15:17,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 15:15:20,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:22,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:15:22,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:15:22,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=57613.333333333336, ans=0.125 2023-09-28 15:15:25,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:25,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:25,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:27,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:15:29,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:15:29,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:29,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:15:30,843 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 15:15:32,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 15:15:34,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:15:36,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:15:36,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:39,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:39,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:39,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=57680.0, ans=0.125 2023-09-28 15:15:40,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:15:40,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:42,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:15:42,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:45,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:15:48,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 15:15:48,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:48,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=57680.0, ans=0.125 2023-09-28 15:15:49,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:53,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:15:53,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:54,733 INFO [train.py:1039] (1/4) Epoch 2, batch 3350, loss[loss=0.288, simple_loss=0.3209, pruned_loss=0.1275, over 23760.00 frames. ], tot_loss[loss=0.3143, simple_loss=0.3531, pruned_loss=0.1377, over 4683615.16 frames. ], batch size: 164, lr: 3.57e-02, grad_scale: 16.0 2023-09-28 15:15:54,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:55,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=57746.666666666664, ans=0.125 2023-09-28 15:15:56,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:56,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:59,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:16:01,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:03,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:16:06,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:07,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:16:09,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:09,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:16:11,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 15:16:13,416 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 15:16:13,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:15,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 15:16:15,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 15:16:16,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:16:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:16:18,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:19,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 15:16:19,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:19,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:16:21,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:24,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:25,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:28,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:16:31,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:34,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:34,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:39,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:16:39,827 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.89 vs. limit=15.0 2023-09-28 15:16:40,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:43,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:43,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:46,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.68 vs. limit=22.5 2023-09-28 15:16:46,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:48,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 15:16:48,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:16:48,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 15:16:49,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:16:49,762 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.45 vs. limit=15.0 2023-09-28 15:16:50,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 15:16:50,636 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:16:51,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:52,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=57946.666666666664, ans=0.125 2023-09-28 15:16:53,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:17:00,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:01,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 15:17:03,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:04,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:17:06,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:17:06,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=58013.333333333336, ans=0.0 2023-09-28 15:17:12,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:12,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 15:17:12,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:17:12,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:17:14,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:14,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 15:17:16,537 INFO [train.py:1039] (1/4) Epoch 2, batch 3400, loss[loss=0.2879, simple_loss=0.3285, pruned_loss=0.1236, over 23363.00 frames. ], tot_loss[loss=0.3135, simple_loss=0.3534, pruned_loss=0.1368, over 4694040.62 frames. ], batch size: 119, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:17:16,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:16,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 15:17:18,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:18,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:19,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:17:21,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:17:21,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 15:17:24,560 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.046e+02 2.787e+02 3.091e+02 3.869e+02 5.571e+02, threshold=6.183e+02, percent-clipped=0.0 2023-09-28 15:17:26,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 15:17:26,298 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 15:17:26,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:17:28,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=58080.0, ans=0.0 2023-09-28 15:17:30,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:30,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:31,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:33,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:17:38,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:17:38,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=58146.666666666664, ans=0.125 2023-09-28 15:17:40,763 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.24 vs. limit=22.5 2023-09-28 15:17:41,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 15:17:41,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=58146.666666666664, ans=0.125 2023-09-28 15:17:46,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=58146.666666666664, ans=0.125 2023-09-28 15:17:47,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:17:50,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:50,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:51,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:17:59,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:17:59,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=58213.333333333336, ans=0.125 2023-09-28 15:18:03,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 15:18:04,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=58280.0, ans=0.0 2023-09-28 15:18:10,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 15:18:10,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:12,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:14,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:18:14,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:18:17,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:18:20,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:18:20,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:18:25,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:25,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 15:18:31,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=58346.666666666664, ans=0.125 2023-09-28 15:18:32,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:18:37,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 15:18:38,596 INFO [train.py:1039] (1/4) Epoch 2, batch 3450, loss[loss=0.3504, simple_loss=0.3808, pruned_loss=0.16, over 23590.00 frames. ], tot_loss[loss=0.3143, simple_loss=0.3541, pruned_loss=0.1372, over 4699691.29 frames. ], batch size: 93, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:18:41,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 15:18:41,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:43,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:18:43,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 15:18:46,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:49,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:18:55,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:18:55,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:18:57,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:18:57,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:59,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:06,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 15:19:10,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 15:19:10,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:19:10,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:19:13,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:17,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=58546.666666666664, ans=0.125 2023-09-28 15:19:20,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 15:19:21,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:19:25,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:25,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:19:27,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:19:28,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:19:30,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 15:19:30,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:19:32,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:35,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:19:37,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 15:19:40,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=58613.333333333336, ans=0.125 2023-09-28 15:19:42,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:19:47,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:19:47,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:52,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:19:57,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:57,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:57,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:19:58,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:20:01,745 INFO [train.py:1039] (1/4) Epoch 2, batch 3500, loss[loss=0.3077, simple_loss=0.3439, pruned_loss=0.1358, over 23636.00 frames. ], tot_loss[loss=0.3127, simple_loss=0.3521, pruned_loss=0.1366, over 4698224.40 frames. ], batch size: 149, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:20:01,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:06,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:20:07,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 15:20:08,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:20:09,934 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.839e+02 3.369e+02 4.173e+02 9.194e+02, threshold=6.738e+02, percent-clipped=6.0 2023-09-28 15:20:11,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:20:13,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:13,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 15:20:18,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:20:20,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:20:22,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:20:22,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:24,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:20:24,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:24,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:25,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 15:20:30,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:32,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:20:33,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:38,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:38,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 15:20:38,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:43,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:45,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:20:46,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:48,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:20:48,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:52,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 15:20:52,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 15:20:52,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 15:20:53,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:55,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:57,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:57,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:21:01,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:21:01,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:21:06,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:21:07,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 15:21:07,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 15:21:07,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:08,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=59013.333333333336, ans=0.0 2023-09-28 15:21:11,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:12,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:14,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:16,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 15:21:16,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:16,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=59013.333333333336, ans=0.0 2023-09-28 15:21:19,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:21:19,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 15:21:22,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 15:21:24,843 INFO [train.py:1039] (1/4) Epoch 2, batch 3550, loss[loss=0.2726, simple_loss=0.3188, pruned_loss=0.1132, over 24285.00 frames. ], tot_loss[loss=0.3113, simple_loss=0.3516, pruned_loss=0.1355, over 4711281.29 frames. ], batch size: 56, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:21:25,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:27,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:28,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:28,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:31,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:21:39,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=59080.0, ans=0.125 2023-09-28 15:21:41,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:43,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 15:21:46,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:21:47,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:21:49,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:51,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:21:51,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:21:53,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.14 vs. limit=22.5 2023-09-28 15:21:54,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:54,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:21:55,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:55,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:21:56,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=59213.333333333336, ans=0.07 2023-09-28 15:21:57,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:22:02,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:22:02,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:22:05,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:05,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:05,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:22:05,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 15:22:05,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:07,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:07,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=59213.333333333336, ans=0.125 2023-09-28 15:22:08,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:22:14,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:14,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:22:15,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=59280.0, ans=0.5 2023-09-28 15:22:15,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.97 vs. limit=22.5 2023-09-28 15:22:16,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:18,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 15:22:18,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:22:19,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 15:22:19,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:21,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:22:23,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:22:25,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 15:22:26,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:33,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:33,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 15:22:34,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:35,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=59346.666666666664, ans=0.125 2023-09-28 15:22:36,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=59346.666666666664, ans=0.0 2023-09-28 15:22:38,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:39,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 15:22:48,134 INFO [train.py:1039] (1/4) Epoch 2, batch 3600, loss[loss=0.307, simple_loss=0.3668, pruned_loss=0.1235, over 24652.00 frames. ], tot_loss[loss=0.3102, simple_loss=0.3508, pruned_loss=0.1349, over 4711690.80 frames. ], batch size: 73, lr: 3.54e-02, grad_scale: 32.0 2023-09-28 15:22:48,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 15:22:48,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:22:49,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:22:51,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:51,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:53,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:22:56,192 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.598e+02 2.903e+02 3.548e+02 6.359e+02, threshold=5.806e+02, percent-clipped=0.0 2023-09-28 15:22:57,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:59,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:00,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:23:01,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:23:02,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:02,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 15:23:08,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:23:09,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:12,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:13,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:15,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:23:15,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:23:15,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 15:23:16,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:19,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:19,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=59546.666666666664, ans=0.125 2023-09-28 15:23:20,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:23:23,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:24,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.17 vs. limit=12.0 2023-09-28 15:23:25,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:25,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:23:26,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 15:23:27,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=59546.666666666664, ans=0.0 2023-09-28 15:23:28,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=59546.666666666664, ans=0.2 2023-09-28 15:23:35,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:23:36,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:23:36,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 15:23:43,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:23:48,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:51,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:58,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:23:58,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:23:58,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 15:24:00,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 15:24:01,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 15:24:05,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:24:05,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:24:06,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 15:24:06,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:08,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:24:08,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:08,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 15:24:09,693 INFO [train.py:1039] (1/4) Epoch 2, batch 3650, loss[loss=0.2662, simple_loss=0.3218, pruned_loss=0.1053, over 24358.00 frames. ], tot_loss[loss=0.3098, simple_loss=0.351, pruned_loss=0.1343, over 4723068.69 frames. ], batch size: 61, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:24:09,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 15:24:13,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=59746.666666666664, ans=0.1 2023-09-28 15:24:14,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:24:14,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 15:24:19,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 15:24:21,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:24:24,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 15:24:24,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=59813.333333333336, ans=0.125 2023-09-28 15:24:25,051 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.70 vs. limit=6.0 2023-09-28 15:24:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 15:24:31,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:24:31,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:24:33,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:24:36,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:24:36,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:36,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 15:24:38,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:24:39,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:39,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 15:24:40,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:24:41,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:24:41,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:43,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:24:46,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 15:24:47,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 15:24:47,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:24:49,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 15:24:50,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:24:51,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:24:57,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:24:59,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:59,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:25:02,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:25:04,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:25:08,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:25:09,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:11,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:11,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:25:15,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:25:15,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-09-28 15:25:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:25:16,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:24,090 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 15:25:27,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:27,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:27,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:25:29,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:31,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:25:32,594 INFO [train.py:1039] (1/4) Epoch 2, batch 3700, loss[loss=0.3384, simple_loss=0.3634, pruned_loss=0.1567, over 23696.00 frames. ], tot_loss[loss=0.3104, simple_loss=0.3518, pruned_loss=0.1345, over 4730841.85 frames. ], batch size: 232, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:25:32,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:34,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 15:25:34,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:37,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:25:39,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:41,569 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.121e+02 2.788e+02 3.403e+02 4.126e+02 8.216e+02, threshold=6.806e+02, percent-clipped=7.0 2023-09-28 15:25:41,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:25:43,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:43,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 15:25:43,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:43,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:25:45,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:25:46,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:25:47,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=60080.0, ans=0.025 2023-09-28 15:25:50,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:51,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:51,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:25:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:53,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:25:55,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:58,226 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 15:26:02,085 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.84 vs. limit=12.0 2023-09-28 15:26:06,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:26:08,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:26:09,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:26:09,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 15:26:09,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:14,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:15,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 15:26:17,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:18,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:26:23,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:23,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:26:24,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=60280.0, ans=12.0 2023-09-28 15:26:24,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:26:29,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:29,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 15:26:30,288 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.68 vs. limit=15.0 2023-09-28 15:26:31,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:26:31,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 15:26:34,028 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-09-28 15:26:36,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:26:37,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:26:41,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:41,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 15:26:44,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:26:44,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:26:44,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:44,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:47,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:48,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 15:26:49,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 15:26:50,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:26:50,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:26:51,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=60346.666666666664, ans=0.0 2023-09-28 15:26:52,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:26:54,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:26:55,336 INFO [train.py:1039] (1/4) Epoch 2, batch 3750, loss[loss=0.4417, simple_loss=0.4332, pruned_loss=0.2251, over 19506.00 frames. ], tot_loss[loss=0.3128, simple_loss=0.3536, pruned_loss=0.136, over 4719124.65 frames. ], batch size: 388, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:26:57,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:58,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:27:00,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:02,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 15:27:02,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=60413.333333333336, ans=15.0 2023-09-28 15:27:03,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:27:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:27:06,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 15:27:08,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:27:08,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:09,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:11,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:27:15,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:17,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:27:20,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:27:24,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:27:27,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:28,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 15:27:30,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:31,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:31,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:34,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 15:27:39,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 15:27:41,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:41,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:43,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:48,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:50,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:27:53,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 15:27:57,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:00,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:28:00,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:28:03,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:28:04,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=60680.0, ans=0.125 2023-09-28 15:28:08,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:28:10,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:28:13,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:28:15,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:28:18,198 INFO [train.py:1039] (1/4) Epoch 2, batch 3800, loss[loss=0.2487, simple_loss=0.3026, pruned_loss=0.09743, over 24233.00 frames. ], tot_loss[loss=0.3119, simple_loss=0.3528, pruned_loss=0.1355, over 4725408.90 frames. ], batch size: 56, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:28:18,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:28:18,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=60746.666666666664, ans=0.015 2023-09-28 15:28:25,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:28:26,486 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.661e+02 3.070e+02 3.841e+02 5.617e+02, threshold=6.140e+02, percent-clipped=0.0 2023-09-28 15:28:30,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:31,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:28:32,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 15:28:35,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:36,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:38,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:28:40,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:28:40,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:40,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:28:40,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=60813.333333333336, ans=0.0 2023-09-28 15:28:41,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:41,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:28:43,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:44,039 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=32.27 vs. limit=22.5 2023-09-28 15:28:44,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 15:28:49,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 15:28:49,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:28:49,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:54,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:28:54,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:28:56,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:28:56,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:58,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:59,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:29:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:29:06,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 15:29:08,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:11,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=60946.666666666664, ans=0.07 2023-09-28 15:29:16,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:22,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:29:24,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 15:29:25,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 15:29:26,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=61013.333333333336, ans=0.0 2023-09-28 15:29:26,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=61013.333333333336, ans=0.125 2023-09-28 15:29:27,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:29:29,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:29,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:32,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 15:29:34,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 15:29:34,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 15:29:34,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:34,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=61013.333333333336, ans=0.125 2023-09-28 15:29:36,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:41,186 INFO [train.py:1039] (1/4) Epoch 2, batch 3850, loss[loss=0.2745, simple_loss=0.3236, pruned_loss=0.1127, over 24334.00 frames. ], tot_loss[loss=0.3099, simple_loss=0.351, pruned_loss=0.1344, over 4722556.19 frames. ], batch size: 56, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:29:41,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:29:41,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:29:45,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=61080.0, ans=0.1 2023-09-28 15:29:49,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:29:49,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 15:29:51,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:29:52,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:54,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.12 vs. limit=12.0 2023-09-28 15:29:55,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:29:57,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:29:59,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=61146.666666666664, ans=0.125 2023-09-28 15:30:01,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:30:02,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 15:30:09,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:10,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:30:14,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:15,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:30:17,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:19,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:30:19,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:19,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:30:21,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:23,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:24,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:24,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:30:24,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 15:30:24,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 15:30:25,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:25,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:28,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:28,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:29,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 15:30:31,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 15:30:34,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:37,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 15:30:39,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:30:44,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:45,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:50,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:50,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 15:30:54,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 15:30:56,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:57,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:59,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:31:00,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:31:02,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:31:02,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 15:31:03,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:31:03,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 15:31:05,394 INFO [train.py:1039] (1/4) Epoch 2, batch 3900, loss[loss=0.307, simple_loss=0.3405, pruned_loss=0.1368, over 23692.00 frames. ], tot_loss[loss=0.3091, simple_loss=0.3503, pruned_loss=0.1339, over 4727519.50 frames. ], batch size: 232, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:31:05,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:05,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:09,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:31:09,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:10,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:31:12,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:12,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:31:13,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:13,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 15:31:14,947 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.111e+02 3.017e+02 3.758e+02 4.866e+02 8.103e+02, threshold=7.517e+02, percent-clipped=9.0 2023-09-28 15:31:15,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:18,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.15 vs. limit=22.5 2023-09-28 15:31:19,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:19,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:19,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:31:21,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:23,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:23,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:25,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:31:26,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 15:31:26,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:26,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.30 vs. limit=15.0 2023-09-28 15:31:29,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 15:31:29,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:30,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 15:31:32,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 15:31:37,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:37,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:37,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:31:37,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:31:43,323 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.39 vs. limit=22.5 2023-09-28 15:31:43,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-28 15:31:44,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:45,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:31:47,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:31:47,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:31:48,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:31:54,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:56,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:32:03,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:32:06,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:32:15,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:18,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:18,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 15:32:18,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 15:32:18,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:18,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=61680.0, ans=0.1 2023-09-28 15:32:21,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 15:32:22,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:32:23,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 15:32:23,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=61680.0, ans=0.125 2023-09-28 15:32:27,645 INFO [train.py:1039] (1/4) Epoch 2, batch 3950, loss[loss=0.2789, simple_loss=0.3346, pruned_loss=0.1116, over 24672.00 frames. ], tot_loss[loss=0.3082, simple_loss=0.3498, pruned_loss=0.1333, over 4730192.92 frames. ], batch size: 65, lr: 3.50e-02, grad_scale: 16.0 2023-09-28 15:32:30,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:32:32,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 15:32:33,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:32:36,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:32:36,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:32:42,766 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 15:32:42,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:42,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 15:32:44,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 15:32:44,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:47,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:47,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:32:47,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:50,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=61813.333333333336, ans=0.07 2023-09-28 15:32:51,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 15:32:54,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:32:54,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:54,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:32:55,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:32:55,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:33:06,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=61880.0, ans=0.125 2023-09-28 15:33:10,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:33:10,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:33:15,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 15:33:21,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 15:33:21,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 15:33:21,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:33:21,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:33:24,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=61946.666666666664, ans=0.125 2023-09-28 15:33:31,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:33:31,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:33:31,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:33:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:33:32,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 15:33:38,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:33:38,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:33:43,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 15:33:50,915 INFO [train.py:1039] (1/4) Epoch 2, batch 4000, loss[loss=0.3334, simple_loss=0.3624, pruned_loss=0.1522, over 23631.00 frames. ], tot_loss[loss=0.3103, simple_loss=0.3513, pruned_loss=0.1346, over 4719972.20 frames. ], batch size: 256, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:33:53,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:33:54,870 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.02 vs. limit=15.0 2023-09-28 15:34:00,519 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.115e+02 2.667e+02 3.102e+02 3.739e+02 5.797e+02, threshold=6.204e+02, percent-clipped=0.0 2023-09-28 15:34:02,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:02,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=62080.0, ans=0.0 2023-09-28 15:34:05,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:05,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:34:05,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=62146.666666666664, ans=0.125 2023-09-28 15:34:06,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:06,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 15:34:08,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:34:09,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 15:34:09,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:34:09,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 15:34:10,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:15,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:34:15,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:34:15,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:34:15,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:15,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:34:16,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=62146.666666666664, ans=0.0 2023-09-28 15:34:17,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:34:19,457 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 15:34:20,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:34:21,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:24,120 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 15:34:24,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:34:24,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:29,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=62213.333333333336, ans=0.0 2023-09-28 15:34:33,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 15:34:33,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:35,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:34:37,108 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 15:34:38,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:34:38,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 15:34:38,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:34:41,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:41,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:34:43,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:34:45,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:34:45,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:47,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 15:34:47,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:49,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=16.09 vs. limit=15.0 2023-09-28 15:34:50,132 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 15:34:52,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=62280.0, ans=0.0 2023-09-28 15:34:53,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:34:57,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:34:59,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:34:59,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:01,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:35:01,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:07,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:11,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:35:11,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 15:35:12,608 INFO [train.py:1039] (1/4) Epoch 2, batch 4050, loss[loss=0.3173, simple_loss=0.3471, pruned_loss=0.1438, over 23696.00 frames. ], tot_loss[loss=0.3095, simple_loss=0.3508, pruned_loss=0.1341, over 4733823.66 frames. ], batch size: 232, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:35:14,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:35:14,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:15,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:35:15,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=62413.333333333336, ans=0.1 2023-09-28 15:35:17,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:17,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:22,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:24,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:35:25,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:35:27,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:35:27,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:35:32,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:34,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:37,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 15:35:38,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 15:35:38,832 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 15:35:41,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:35:42,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=62480.0, ans=0.125 2023-09-28 15:35:49,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 15:35:50,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:35:54,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:57,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:57,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:35:57,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:36:00,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:36:03,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 15:36:03,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:36:05,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=62613.333333333336, ans=0.125 2023-09-28 15:36:07,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:07,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=62613.333333333336, ans=0.035 2023-09-28 15:36:09,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 15:36:13,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:23,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 15:36:25,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:36:25,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:36:26,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 15:36:26,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 15:36:26,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:29,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:36:30,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=62680.0, ans=0.0 2023-09-28 15:36:31,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:31,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:36:35,778 INFO [train.py:1039] (1/4) Epoch 2, batch 4100, loss[loss=0.2766, simple_loss=0.3302, pruned_loss=0.1115, over 24384.00 frames. ], tot_loss[loss=0.3086, simple_loss=0.3505, pruned_loss=0.1334, over 4735984.61 frames. ], batch size: 61, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:36:39,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 15:36:39,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 15:36:42,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 15:36:44,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 15:36:44,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:44,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:36:45,981 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 15:36:47,354 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.099e+02 2.677e+02 3.262e+02 4.112e+02 6.784e+02, threshold=6.525e+02, percent-clipped=2.0 2023-09-28 15:36:49,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:49,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:36:49,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:51,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:36:56,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:36:58,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:58,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:36:58,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 15:36:59,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:59,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:36:59,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:01,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:37:01,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 15:37:05,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=62813.333333333336, ans=0.0 2023-09-28 15:37:06,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:07,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 15:37:07,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:37:11,451 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.19 vs. limit=22.5 2023-09-28 15:37:12,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 15:37:13,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:37:13,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:37:13,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:37:15,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 15:37:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:37:19,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:37:22,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 15:37:23,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:37:23,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:25,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.25 vs. limit=12.0 2023-09-28 15:37:27,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:32,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:37:32,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=62946.666666666664, ans=0.125 2023-09-28 15:37:33,018 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.47 vs. limit=15.0 2023-09-28 15:37:35,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:35,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:37:38,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=62946.666666666664, ans=0.125 2023-09-28 15:37:46,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:37:46,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:48,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:51,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:37:58,100 INFO [train.py:1039] (1/4) Epoch 2, batch 4150, loss[loss=0.3238, simple_loss=0.3682, pruned_loss=0.1397, over 24673.00 frames. ], tot_loss[loss=0.3093, simple_loss=0.351, pruned_loss=0.1338, over 4731210.62 frames. ], batch size: 65, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:37:58,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:58,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:37:59,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:37:59,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:03,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 15:38:03,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:03,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 15:38:05,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 15:38:05,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 15:38:05,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=63080.0, ans=0.125 2023-09-28 15:38:06,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:07,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=63080.0, ans=0.025 2023-09-28 15:38:11,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:38:11,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:12,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=63080.0, ans=0.125 2023-09-28 15:38:15,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:17,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:18,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:38:18,966 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:38:20,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:38:20,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:21,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:38:25,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:30,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:31,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 15:38:34,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 15:38:34,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:38:36,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 15:38:36,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:38:36,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:39,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:39,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:45,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 15:38:48,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:38:51,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:38:52,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 15:38:52,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=63280.0, ans=0.0 2023-09-28 15:38:53,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:55,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 15:38:55,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:38:58,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:59,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:59,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 15:38:59,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:59,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:39:03,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:39:03,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=63346.666666666664, ans=0.125 2023-09-28 15:39:04,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=63346.666666666664, ans=0.0 2023-09-28 15:39:06,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 15:39:07,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:07,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:39:07,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:39:07,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 15:39:09,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:39:09,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:39:09,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=63346.666666666664, ans=0.5 2023-09-28 15:39:10,194 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.25 vs. limit=10.0 2023-09-28 15:39:10,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:39:14,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:14,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 15:39:14,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:39:19,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:39:21,076 INFO [train.py:1039] (1/4) Epoch 2, batch 4200, loss[loss=0.2931, simple_loss=0.3561, pruned_loss=0.115, over 24281.00 frames. ], tot_loss[loss=0.3077, simple_loss=0.3499, pruned_loss=0.1327, over 4724860.30 frames. ], batch size: 74, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:39:21,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 15:39:21,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.12 vs. limit=15.0 2023-09-28 15:39:24,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:39:26,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:27,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:39:27,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:27,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:30,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 15:39:32,117 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.187e+02 2.868e+02 3.365e+02 4.143e+02 5.998e+02, threshold=6.730e+02, percent-clipped=0.0 2023-09-28 15:39:33,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 15:39:33,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:35,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=63480.0, ans=0.125 2023-09-28 15:39:37,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:39,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:39:42,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:39:44,803 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-09-28 15:39:46,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:39:46,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:47,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 15:39:47,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:49,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:49,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:49,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:39:49,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=63480.0, ans=0.125 2023-09-28 15:39:52,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:39:55,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 15:39:55,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:59,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:40:01,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:40:02,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:40:04,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:07,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:40:07,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 15:40:07,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:08,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:40:14,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:40:17,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:22,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:40:25,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 15:40:29,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:34,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:40:34,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:37,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 15:40:42,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:40:43,861 INFO [train.py:1039] (1/4) Epoch 2, batch 4250, loss[loss=0.2963, simple_loss=0.3526, pruned_loss=0.12, over 24652.00 frames. ], tot_loss[loss=0.305, simple_loss=0.3472, pruned_loss=0.1314, over 4715932.32 frames. ], batch size: 73, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:40:45,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:45,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:40:47,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:49,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=63746.666666666664, ans=0.0 2023-09-28 15:40:54,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:40:55,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 15:40:55,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:58,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:02,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:05,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:07,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:08,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:41:08,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:10,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:10,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:11,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:13,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:41:13,975 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=2.611e-03 2023-09-28 15:41:15,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:16,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 15:41:20,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 15:41:21,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:22,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:22,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:23,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:41:23,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:23,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:28,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:41:30,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:41:34,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:41:35,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:35,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 15:41:35,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:41:37,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 15:41:39,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:41:40,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:41:43,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:43,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:45,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 15:41:46,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:41:48,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:41:48,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=63946.666666666664, ans=0.2 2023-09-28 15:41:51,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:55,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:56,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:41:59,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:42:00,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:02,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:42:02,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:02,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 15:42:05,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:09,198 INFO [train.py:1039] (1/4) Epoch 2, batch 4300, loss[loss=0.2819, simple_loss=0.3315, pruned_loss=0.1161, over 24472.00 frames. ], tot_loss[loss=0.3048, simple_loss=0.347, pruned_loss=0.1313, over 4725456.63 frames. ], batch size: 63, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:42:09,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=64080.0, ans=0.05 2023-09-28 15:42:12,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:12,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:15,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:19,715 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.736e+02 3.234e+02 3.981e+02 6.423e+02, threshold=6.467e+02, percent-clipped=0.0 2023-09-28 15:42:23,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:42:23,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 15:42:24,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:42:26,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:42:26,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:42:26,333 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 15:42:29,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:42:32,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:42:36,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 15:42:36,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:42:36,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 15:42:39,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=64146.666666666664, ans=0.035 2023-09-28 15:42:40,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:42:42,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:42:45,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:42:45,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:47,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:42:47,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=64213.333333333336, ans=0.125 2023-09-28 15:42:48,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:48,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:48,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 15:42:50,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 15:42:53,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:42:56,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:42:56,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:56,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 15:42:56,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 15:42:57,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 15:42:59,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:43:00,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 15:43:00,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 15:43:06,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:08,208 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 15:43:10,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:43:10,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=64280.0, ans=0.125 2023-09-28 15:43:12,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:12,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:15,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 15:43:16,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:43:16,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:17,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:19,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:19,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:43:22,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:43:22,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.85 vs. limit=15.0 2023-09-28 15:43:23,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=64346.666666666664, ans=0.0 2023-09-28 15:43:25,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:25,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=64346.666666666664, ans=0.09899494936611666 2023-09-28 15:43:26,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:26,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:29,607 INFO [train.py:1039] (1/4) Epoch 2, batch 4350, loss[loss=0.2534, simple_loss=0.3119, pruned_loss=0.09748, over 24334.00 frames. ], tot_loss[loss=0.3055, simple_loss=0.3476, pruned_loss=0.1317, over 4722785.45 frames. ], batch size: 61, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:43:32,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 15:43:34,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:43:38,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:43:40,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:44,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:43:44,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:43:49,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:43:51,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=64480.0, ans=0.09899494936611666 2023-09-28 15:43:53,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:55,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:43:55,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:55,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=64480.0, ans=0.2 2023-09-28 15:43:59,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:44:02,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:44:04,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:44:09,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 15:44:11,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:12,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:12,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=64546.666666666664, ans=0.0 2023-09-28 15:44:16,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:20,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 15:44:22,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:24,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:44:31,249 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 15:44:32,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:32,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:44:32,898 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 15:44:34,354 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 15:44:34,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:34,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:35,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:44:37,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:37,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:38,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:44:40,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 15:44:40,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:40,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:40,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:42,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 15:44:42,302 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 15:44:42,309 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 15:44:42,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 15:44:46,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:44:46,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:44:46,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:44:48,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:44:48,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=64680.0, ans=0.125 2023-09-28 15:44:50,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 15:44:51,882 INFO [train.py:1039] (1/4) Epoch 2, batch 4400, loss[loss=0.3214, simple_loss=0.3734, pruned_loss=0.1347, over 24385.00 frames. ], tot_loss[loss=0.3061, simple_loss=0.3484, pruned_loss=0.1319, over 4728222.15 frames. ], batch size: 77, lr: 3.45e-02, grad_scale: 32.0 2023-09-28 15:44:52,102 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 15:44:52,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:56,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:44:56,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:58,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:45:00,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 15:45:02,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 15:45:02,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 15:45:02,425 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 15:45:03,802 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.169e+02 2.849e+02 3.157e+02 3.871e+02 7.582e+02, threshold=6.315e+02, percent-clipped=2.0 2023-09-28 15:45:03,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:45:03,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:45:05,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 15:45:07,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.56 vs. limit=6.0 2023-09-28 15:45:08,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:10,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:10,175 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 15:45:10,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=64813.333333333336, ans=0.125 2023-09-28 15:45:13,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:13,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 15:45:13,340 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 15:45:17,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 15:45:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 15:45:17,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 15:45:19,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:19,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:22,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 15:45:22,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 15:45:22,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:26,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:45:26,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:27,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:29,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:29,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 15:45:29,432 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 15:45:33,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:40,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=64946.666666666664, ans=0.125 2023-09-28 15:45:43,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 15:45:48,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:45:51,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:45:54,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:45:54,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 15:45:54,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:45:54,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:45:54,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:45:55,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:46:00,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 15:46:04,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 15:46:05,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 15:46:05,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:05,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 15:46:07,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:46:12,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:46:13,999 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.24 vs. limit=15.0 2023-09-28 15:46:14,486 INFO [train.py:1039] (1/4) Epoch 2, batch 4450, loss[loss=0.3544, simple_loss=0.3743, pruned_loss=0.1673, over 23611.00 frames. ], tot_loss[loss=0.307, simple_loss=0.3491, pruned_loss=0.1324, over 4734263.36 frames. ], batch size: 256, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:46:14,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 15:46:17,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:46:20,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:22,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:46:25,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=65080.0, ans=0.125 2023-09-28 15:46:26,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=65080.0, ans=0.125 2023-09-28 15:46:29,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:46:29,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:46:34,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:37,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:46:40,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:46:41,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:42,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 15:46:42,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:42,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:43,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:46:43,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:46:46,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:46:50,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:51,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:53,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:53,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:55,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:46:57,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.22 vs. limit=10.0 2023-09-28 15:46:58,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:46:59,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 15:46:59,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 15:46:59,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:47:01,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:02,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 15:47:03,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=65280.0, ans=0.125 2023-09-28 15:47:07,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:47:09,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:11,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 15:47:11,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:11,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:11,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:47:11,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:12,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.28 vs. limit=22.5 2023-09-28 15:47:13,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=65280.0, ans=0.125 2023-09-28 15:47:14,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:20,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:47:21,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 15:47:22,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=65346.666666666664, ans=0.125 2023-09-28 15:47:22,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=65346.666666666664, ans=0.125 2023-09-28 15:47:23,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:47:23,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=65346.666666666664, ans=0.125 2023-09-28 15:47:23,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=65346.666666666664, ans=0.0 2023-09-28 15:47:26,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:47:26,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:28,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:28,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:47:31,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:47:35,449 INFO [train.py:1039] (1/4) Epoch 2, batch 4500, loss[loss=0.3087, simple_loss=0.3352, pruned_loss=0.1411, over 23846.00 frames. ], tot_loss[loss=0.309, simple_loss=0.3502, pruned_loss=0.134, over 4724909.60 frames. ], batch size: 179, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:47:35,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 15:47:37,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:47:41,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:43,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 15:47:43,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 15:47:44,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:47:46,459 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.128e+02 2.918e+02 3.364e+02 4.065e+02 7.320e+02, threshold=6.729e+02, percent-clipped=3.0 2023-09-28 15:47:50,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:50,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:51,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:47:53,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:47:53,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:47:53,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:47:55,777 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:48:06,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:48:06,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:48:09,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:11,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:48:11,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:48:16,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:48:22,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:48:26,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:48:28,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:48:29,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 15:48:30,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:31,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:35,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=65613.33333333333, ans=0.125 2023-09-28 15:48:35,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=65613.33333333333, ans=0.0 2023-09-28 15:48:36,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:48:36,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 15:48:36,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:48:36,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:42,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:48:42,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:48:44,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:45,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:48:46,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:48:47,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 15:48:49,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 15:48:51,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 15:48:56,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 15:48:57,840 INFO [train.py:1039] (1/4) Epoch 2, batch 4550, loss[loss=0.338, simple_loss=0.3556, pruned_loss=0.1602, over 23947.00 frames. ], tot_loss[loss=0.3083, simple_loss=0.3496, pruned_loss=0.1334, over 4717221.05 frames. ], batch size: 195, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:48:58,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 15:48:58,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:01,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:01,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:06,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:08,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=65746.66666666667, ans=0.125 2023-09-28 15:49:11,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:49:13,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:49:16,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:16,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:49:16,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:19,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:19,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:23,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:49:26,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 15:49:27,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 15:49:27,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:49:29,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 15:49:33,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 15:49:33,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:37,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 15:49:38,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:49:41,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:49:45,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 15:49:47,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:49:49,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:49,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:51,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:52,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 15:49:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 15:49:54,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:49:55,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 15:49:56,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 15:49:58,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:58,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:58,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:00,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:01,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:50:01,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:50:03,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 15:50:06,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:50:06,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:50:06,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 15:50:06,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:50:06,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 15:50:10,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:50:11,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:50:13,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:50:13,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:14,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:50:17,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:50:18,868 INFO [train.py:1039] (1/4) Epoch 2, batch 4600, loss[loss=0.2987, simple_loss=0.341, pruned_loss=0.1282, over 23195.00 frames. ], tot_loss[loss=0.306, simple_loss=0.3477, pruned_loss=0.1322, over 4708664.30 frames. ], batch size: 119, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:50:19,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:50:19,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-09-28 15:50:20,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:22,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:25,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:50:25,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:50:25,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=66080.0, ans=0.125 2023-09-28 15:50:26,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:27,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=66080.0, ans=0.5 2023-09-28 15:50:28,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 15:50:29,711 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.215e+02 2.622e+02 3.070e+02 3.813e+02 6.355e+02, threshold=6.141e+02, percent-clipped=0.0 2023-09-28 15:50:30,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:50:33,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:50:33,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:35,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:44,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 15:50:44,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=66146.66666666667, ans=0.125 2023-09-28 15:50:45,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:47,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=66146.66666666667, ans=0.0 2023-09-28 15:50:48,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:51,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:50:51,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:57,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 15:50:57,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:50:57,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:04,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:05,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:51:06,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:51:11,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 15:51:11,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=66280.0, ans=0.0 2023-09-28 15:51:12,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:51:16,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:16,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:51:20,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:20,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 15:51:20,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:22,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 15:51:22,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:23,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:24,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:26,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:51:27,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:27,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 15:51:29,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 15:51:29,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 15:51:29,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:32,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:32,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:33,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:41,410 INFO [train.py:1039] (1/4) Epoch 2, batch 4650, loss[loss=0.3071, simple_loss=0.3657, pruned_loss=0.1243, over 24441.00 frames. ], tot_loss[loss=0.3053, simple_loss=0.3468, pruned_loss=0.1319, over 4707780.95 frames. ], batch size: 69, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:51:44,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:51:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:48,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=66413.33333333333, ans=0.125 2023-09-28 15:51:49,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:49,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:51:50,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:50,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:50,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:56,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 15:51:58,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:51:58,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=66480.0, ans=0.05 2023-09-28 15:51:59,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 15:51:59,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:52:01,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 15:52:01,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:52:02,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 15:52:02,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 15:52:02,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:04,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:52:07,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:52:08,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:09,026 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 15:52:14,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:16,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 15:52:18,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:18,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:52:19,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 15:52:21,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:52:24,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:52:28,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=66546.66666666667, ans=0.0 2023-09-28 15:52:29,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:33,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:36,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:52:39,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 15:52:39,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 15:52:39,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 15:52:39,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 15:52:42,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:52:48,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:52:48,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:52:48,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 15:52:50,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:51,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:51,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:52:52,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.53 vs. limit=15.0 2023-09-28 15:52:53,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:52:55,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:52:55,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:55,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:53:00,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=66680.0, ans=0.125 2023-09-28 15:53:01,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:01,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:53:01,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:53:03,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 15:53:03,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:53:05,173 INFO [train.py:1039] (1/4) Epoch 2, batch 4700, loss[loss=0.267, simple_loss=0.3236, pruned_loss=0.1052, over 24334.00 frames. ], tot_loss[loss=0.305, simple_loss=0.3469, pruned_loss=0.1316, over 4715436.95 frames. ], batch size: 61, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:53:06,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 15:53:14,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:15,875 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.037e+02 2.802e+02 3.291e+02 3.873e+02 6.346e+02, threshold=6.582e+02, percent-clipped=1.0 2023-09-28 15:53:16,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:16,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:53:18,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:18,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=66746.66666666667, ans=0.0 2023-09-28 15:53:19,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:53:26,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 15:53:26,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 15:53:29,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:30,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:53:30,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:53:34,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:40,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:53:41,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:53:41,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=66880.0, ans=0.125 2023-09-28 15:53:44,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:51,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 15:53:54,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:53:55,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:53:57,007 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=22.5 2023-09-28 15:53:59,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 15:53:59,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=66946.66666666667, ans=0.09899494936611666 2023-09-28 15:54:00,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:04,986 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=22.5 2023-09-28 15:54:05,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:54:06,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 15:54:07,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:08,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:10,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:54:10,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:54:10,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 15:54:12,393 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 15:54:13,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:17,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 15:54:18,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:22,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 15:54:23,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:54:24,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:27,490 INFO [train.py:1039] (1/4) Epoch 2, batch 4750, loss[loss=0.3026, simple_loss=0.3591, pruned_loss=0.123, over 24302.00 frames. ], tot_loss[loss=0.3062, simple_loss=0.3484, pruned_loss=0.1321, over 4723458.69 frames. ], batch size: 74, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:54:31,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:31,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:54:32,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 15:54:34,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:54:36,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=67080.0, ans=0.04949747468305833 2023-09-28 15:54:38,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 15:54:40,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:54:40,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:41,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:45,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=67146.66666666667, ans=0.1 2023-09-28 15:54:47,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 15:54:49,023 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:54:51,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:54:52,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 15:54:54,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:55,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:55,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:57,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:59,159 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 15:54:59,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 15:54:59,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=67213.33333333333, ans=0.5 2023-09-28 15:55:04,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 15:55:06,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:07,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:10,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:55:10,688 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 15:55:10,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:12,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:55:14,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=67213.33333333333, ans=0.09899494936611666 2023-09-28 15:55:17,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:55:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 15:55:18,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 15:55:19,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=67280.0, ans=0.125 2023-09-28 15:55:20,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:55:20,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:55:20,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:23,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:55:23,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 15:55:24,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=18.37 vs. limit=15.0 2023-09-28 15:55:26,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 15:55:28,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:55:33,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:55:33,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 15:55:33,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:55:35,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:37,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:55:38,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:38,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:55:43,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:43,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 15:55:44,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 15:55:45,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=67346.66666666667, ans=0.125 2023-09-28 15:55:46,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 15:55:48,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:55:48,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:49,554 INFO [train.py:1039] (1/4) Epoch 2, batch 4800, loss[loss=0.2793, simple_loss=0.3338, pruned_loss=0.1124, over 24305.00 frames. ], tot_loss[loss=0.3095, simple_loss=0.3507, pruned_loss=0.1342, over 4716456.51 frames. ], batch size: 61, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:55:51,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 15:55:56,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:56,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.04 vs. limit=22.5 2023-09-28 15:55:57,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:00,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=67413.33333333333, ans=0.125 2023-09-28 15:56:01,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.813e+02 3.481e+02 4.018e+02 6.093e+02, threshold=6.961e+02, percent-clipped=0.0 2023-09-28 15:56:03,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:56:04,305 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.70 vs. limit=15.0 2023-09-28 15:56:05,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:05,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 15:56:08,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:56:08,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:56:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:56:15,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:18,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:18,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:56:20,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:20,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:56:20,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:21,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:21,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=67546.66666666667, ans=0.125 2023-09-28 15:56:24,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:26,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=67546.66666666667, ans=0.2 2023-09-28 15:56:27,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:56:32,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:56:33,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:34,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 15:56:36,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 15:56:37,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:37,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:56:37,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:56:37,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:37,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:56:39,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:56:41,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:41,491 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:56:43,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=67613.33333333333, ans=0.1 2023-09-28 15:56:46,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:56:46,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=67613.33333333333, ans=0.1 2023-09-28 15:56:48,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:50,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:56:55,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 15:56:55,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:57,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:57,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:56:58,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:02,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:57:04,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:57:04,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:04,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:57:05,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:57:05,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:57:09,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=67680.0, ans=0.1 2023-09-28 15:57:10,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:10,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:10,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:57:11,580 INFO [train.py:1039] (1/4) Epoch 2, batch 4850, loss[loss=0.2851, simple_loss=0.3577, pruned_loss=0.1063, over 24305.00 frames. ], tot_loss[loss=0.3086, simple_loss=0.3504, pruned_loss=0.1334, over 4712200.42 frames. ], batch size: 74, lr: 3.40e-02, grad_scale: 32.0 2023-09-28 15:57:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 15:57:14,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 15:57:14,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:14,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:15,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:15,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:16,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=67746.66666666667, ans=0.0 2023-09-28 15:57:18,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:23,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.48 vs. limit=10.0 2023-09-28 15:57:26,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 15:57:27,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:32,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=67813.33333333333, ans=0.2 2023-09-28 15:57:33,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:33,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:57:33,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:35,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=67813.33333333333, ans=0.125 2023-09-28 15:57:40,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:57:41,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:57:41,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 15:57:46,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:47,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:57:47,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:57:48,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=67880.0, ans=0.125 2023-09-28 15:57:49,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:57:49,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 15:57:50,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.01 vs. limit=22.5 2023-09-28 15:57:52,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:52,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 15:57:56,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 15:57:57,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:58:00,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=67946.66666666667, ans=0.125 2023-09-28 15:58:03,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=67946.66666666667, ans=0.125 2023-09-28 15:58:06,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:58:06,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 15:58:08,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:58:08,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:58:11,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:58:14,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 15:58:14,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:16,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 15:58:16,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:16,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:18,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 15:58:25,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:30,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:58:30,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:34,385 INFO [train.py:1039] (1/4) Epoch 2, batch 4900, loss[loss=0.3139, simple_loss=0.3443, pruned_loss=0.1418, over 23371.00 frames. ], tot_loss[loss=0.3065, simple_loss=0.3484, pruned_loss=0.1323, over 4697563.71 frames. ], batch size: 119, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:58:37,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 15:58:37,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:58:42,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:42,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:42,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:58:43,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.78 vs. limit=10.0 2023-09-28 15:58:46,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 15:58:47,479 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.052e+02 2.694e+02 3.057e+02 3.718e+02 7.972e+02, threshold=6.114e+02, percent-clipped=1.0 2023-09-28 15:58:50,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 15:58:54,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 15:58:55,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 15:58:57,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:58:57,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:57,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:58:57,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:57,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:58:59,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 15:59:02,291 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.93 vs. limit=22.5 2023-09-28 15:59:03,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 15:59:03,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:59:06,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:59:06,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:59:06,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=68213.33333333333, ans=0.05 2023-09-28 15:59:09,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:59:09,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:10,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:10,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 15:59:12,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:59:14,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:59:14,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 15:59:14,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 15:59:14,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=68213.33333333333, ans=0.125 2023-09-28 15:59:17,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 15:59:20,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:59:22,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:59:22,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:59:22,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=68280.0, ans=0.125 2023-09-28 15:59:23,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:23,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:59:25,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:59:25,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 15:59:26,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:27,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=68280.0, ans=0.125 2023-09-28 15:59:28,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:59:31,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:59:33,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=68280.0, ans=0.2 2023-09-28 15:59:35,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 15:59:35,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:59:37,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:59:38,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 15:59:40,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=68346.66666666667, ans=0.0 2023-09-28 15:59:44,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:46,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:59:47,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 15:59:47,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:59:47,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:59:50,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:53,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=68346.66666666667, ans=0.125 2023-09-28 15:59:54,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:59:54,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:59:54,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:54,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:59:55,940 INFO [train.py:1039] (1/4) Epoch 2, batch 4950, loss[loss=0.2765, simple_loss=0.3416, pruned_loss=0.1057, over 24347.00 frames. ], tot_loss[loss=0.3042, simple_loss=0.3469, pruned_loss=0.1308, over 4711156.78 frames. ], batch size: 74, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:59:57,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:00:00,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:00,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:00:03,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 16:00:03,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 16:00:03,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:00:04,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=68413.33333333333, ans=0.0 2023-09-28 16:00:05,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 16:00:05,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:05,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:00:06,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:00:08,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:10,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:12,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:00:13,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:00:15,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:15,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=68480.0, ans=0.0 2023-09-28 16:00:16,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:18,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:00:19,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:00:25,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:26,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:00:28,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:29,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:31,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:00:31,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 16:00:33,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 16:00:36,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:36,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=68546.66666666667, ans=0.125 2023-09-28 16:00:37,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:00:37,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:00:38,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:00:38,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:00:39,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:00:40,517 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.89 vs. limit=12.0 2023-09-28 16:00:41,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:46,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:00:46,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=68613.33333333333, ans=0.125 2023-09-28 16:00:47,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:00:48,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:49,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=68613.33333333333, ans=0.125 2023-09-28 16:00:50,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:50,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 16:00:51,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:00:53,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:00:58,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:00,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:01:01,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:01:01,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:01,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:01:03,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:01:04,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:01:04,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=68680.0, ans=0.125 2023-09-28 16:01:05,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:01:06,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:01:07,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 16:01:10,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:15,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 16:01:15,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:01:16,657 INFO [train.py:1039] (1/4) Epoch 2, batch 5000, loss[loss=0.33, simple_loss=0.3611, pruned_loss=0.1494, over 23841.00 frames. ], tot_loss[loss=0.3023, simple_loss=0.3453, pruned_loss=0.1296, over 4712654.93 frames. ], batch size: 179, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:01:17,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=68746.66666666667, ans=0.05 2023-09-28 16:01:22,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:22,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:22,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=68746.66666666667, ans=0.125 2023-09-28 16:01:24,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 16:01:25,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 16:01:27,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:01:30,660 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.879e+02 2.855e+02 3.346e+02 4.050e+02 6.399e+02, threshold=6.691e+02, percent-clipped=1.0 2023-09-28 16:01:30,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 16:01:30,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:01:31,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:01:33,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 16:01:33,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:33,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:01:35,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 16:01:35,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:36,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:01:38,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 16:01:38,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 16:01:38,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:01:40,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 16:01:40,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:01:40,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:41,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:01:41,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 16:01:41,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 16:01:44,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.26 vs. limit=15.0 2023-09-28 16:01:44,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 16:01:44,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:45,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:46,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 16:01:48,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:51,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:52,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:54,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:01:56,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 16:01:58,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:58,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:01:58,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=68880.0, ans=0.125 2023-09-28 16:02:00,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=68880.0, ans=0.125 2023-09-28 16:02:03,357 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 16:02:08,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:02:10,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:02:10,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:13,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 16:02:13,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:02:13,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:15,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:16,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 16:02:17,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:21,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:27,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 16:02:31,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:39,041 INFO [train.py:1039] (1/4) Epoch 2, batch 5050, loss[loss=0.2995, simple_loss=0.364, pruned_loss=0.1176, over 24319.00 frames. ], tot_loss[loss=0.302, simple_loss=0.3455, pruned_loss=0.1293, over 4705554.56 frames. ], batch size: 74, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:02:39,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=69080.0, ans=0.125 2023-09-28 16:02:39,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=69080.0, ans=0.1 2023-09-28 16:02:41,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:41,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:42,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:02:42,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:42,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:02:42,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:02:43,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 16:02:47,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:02:51,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:53,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:02:53,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 16:02:53,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:55,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:56,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:02:58,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:02:58,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:03:09,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 16:03:09,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:03:09,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:11,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 16:03:11,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:13,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:14,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:03:14,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:03:14,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 16:03:16,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 16:03:17,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:19,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:23,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:23,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 16:03:24,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:28,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 16:03:28,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:03:28,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:03:30,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:31,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:31,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:03:35,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:03:36,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:36,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:03:36,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:03:36,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 16:03:37,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:03:39,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:43,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:43,423 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 16:03:43,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:03:44,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:03:46,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:46,324 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 16:03:50,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:50,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 16:03:50,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:53,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:54,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=69346.66666666667, ans=0.1 2023-09-28 16:03:55,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:55,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 16:03:56,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 16:04:00,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:00,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:00,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:04:01,406 INFO [train.py:1039] (1/4) Epoch 2, batch 5100, loss[loss=0.3095, simple_loss=0.3615, pruned_loss=0.1288, over 24685.00 frames. ], tot_loss[loss=0.3034, simple_loss=0.347, pruned_loss=0.1299, over 4708047.72 frames. ], batch size: 73, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:04:03,219 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 16:04:04,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:04:10,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 16:04:10,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 16:04:11,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:13,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:04:15,245 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.986e+02 2.824e+02 3.084e+02 3.697e+02 6.472e+02, threshold=6.168e+02, percent-clipped=0.0 2023-09-28 16:04:17,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:04:17,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 16:04:17,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 16:04:24,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:04:24,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:04:27,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:32,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 16:04:32,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:32,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:04:32,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 16:04:35,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 16:04:39,796 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 16:04:39,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:41,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 16:04:41,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 16:04:46,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:55,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:04:59,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 16:04:59,789 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 16:04:59,811 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 16:05:01,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 16:05:01,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:05:04,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 16:05:06,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=69680.0, ans=0.0 2023-09-28 16:05:07,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 16:05:10,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 16:05:12,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:05:15,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 16:05:17,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:05:18,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 16:05:22,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=69746.66666666667, ans=0.125 2023-09-28 16:05:23,790 INFO [train.py:1039] (1/4) Epoch 2, batch 5150, loss[loss=0.3336, simple_loss=0.3609, pruned_loss=0.1532, over 22748.00 frames. ], tot_loss[loss=0.3063, simple_loss=0.3487, pruned_loss=0.132, over 4691718.75 frames. ], batch size: 322, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:05:25,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:05:25,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:05:25,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:05:25,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:05:25,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:05:27,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:05:30,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 16:05:30,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 16:05:30,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 16:05:30,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:05:30,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 16:05:31,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:33,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:05:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:37,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:41,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:05:41,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 16:05:44,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:45,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:05:47,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:05:47,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:05:47,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:05:48,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:05:48,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:05:48,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 16:05:51,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:05:51,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=69813.33333333333, ans=0.125 2023-09-28 16:05:52,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:05:53,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:05:54,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 16:05:55,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:06:02,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:06:02,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 16:06:06,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:08,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=69880.0, ans=0.07 2023-09-28 16:06:12,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:13,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:16,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:18,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:20,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 16:06:25,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:06:27,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:06:27,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:06:30,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:30,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:32,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 16:06:37,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:39,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:06:42,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:42,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:06:43,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:06:43,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:06:43,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:06:43,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:06:45,233 INFO [train.py:1039] (1/4) Epoch 2, batch 5200, loss[loss=0.329, simple_loss=0.36, pruned_loss=0.149, over 23303.00 frames. ], tot_loss[loss=0.307, simple_loss=0.3493, pruned_loss=0.1323, over 4690415.15 frames. ], batch size: 119, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:06:48,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:06:48,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:06:53,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:55,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 16:06:57,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:06:58,451 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.056e+02 2.942e+02 3.378e+02 4.176e+02 6.037e+02, threshold=6.756e+02, percent-clipped=0.0 2023-09-28 16:06:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:00,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:02,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:07:02,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:04,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 16:07:07,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:07:08,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:12,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 16:07:14,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:07:14,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:07:16,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 16:07:17,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 16:07:20,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 16:07:22,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:22,265 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 16:07:22,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:23,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:23,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:07:25,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 16:07:25,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:07:27,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:32,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 16:07:32,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 16:07:32,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 16:07:35,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 16:07:37,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:07:42,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:07:44,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:07:45,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 16:07:47,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:47,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:07:47,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:47,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:07:52,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:07:53,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:07:55,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:58,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:07:58,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:03,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:04,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 16:08:06,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:08:06,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:08:08,685 INFO [train.py:1039] (1/4) Epoch 2, batch 5250, loss[loss=0.3081, simple_loss=0.3417, pruned_loss=0.1372, over 23648.00 frames. ], tot_loss[loss=0.3056, simple_loss=0.3486, pruned_loss=0.1313, over 4709064.01 frames. ], batch size: 149, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:08:08,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:08,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:08:10,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:08:12,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:08:14,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=70413.33333333333, ans=0.0 2023-09-28 16:08:16,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:16,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:08:18,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:08:25,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:26,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:08:28,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:08:30,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:08:33,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 16:08:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:34,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:22,884 INFO [train.py:1039] (1/4) Epoch 2, batch 5300, loss[loss=0.2894, simple_loss=0.3561, pruned_loss=0.1113, over 24626.00 frames. ], tot_loss[loss=0.3024, simple_loss=0.3451, pruned_loss=0.1298, over 4703535.73 frames. ], batch size: 68, lr: 3.35e-02, grad_scale: 32.0 2023-09-28 16:09:24,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=70746.66666666667, ans=0.125 2023-09-28 16:09:26,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=70746.66666666667, ans=0.07 2023-09-28 16:09:28,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=70746.66666666667, ans=0.125 2023-09-28 16:09:31,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=70746.66666666667, ans=0.025 2023-09-28 16:09:34,305 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.707e+02 3.072e+02 3.599e+02 7.324e+02, threshold=6.143e+02, percent-clipped=3.0 2023-09-28 16:09:37,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:09:37,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 16:09:37,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 16:09:37,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:38,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:38,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:38,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:38,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:39,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:09:39,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:39,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:09:39,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:09:39,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 16:09:39,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 16:09:40,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 16:09:40,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:09:40,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 16:09:40,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 16:09:40,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:41,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:41,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:41,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:41,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:09:41,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:41,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:41,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:42,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:42,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:09:42,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:42,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:09:43,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 16:09:43,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:44,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:44,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 16:09:44,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 16:09:44,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:09:44,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:09:44,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 16:09:44,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 16:09:44,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:45,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:09:45,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:45,737 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 16:09:45,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 16:09:45,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:09:46,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:46,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 16:09:46,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 16:09:46,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 16:09:47,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:56,614 INFO [train.py:1039] (1/4) Epoch 3, batch 0, loss[loss=0.2863, simple_loss=0.3331, pruned_loss=0.1197, over 24435.00 frames. ], tot_loss[loss=0.2863, simple_loss=0.3331, pruned_loss=0.1197, over 24435.00 frames. ], batch size: 58, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:09:56,615 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 16:10:11,598 INFO [train.py:1071] (1/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3654, pruned_loss=0.2147, over 1125622.00 frames. 2023-09-28 16:10:11,599 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 16:10:14,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 16:10:15,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=70826.66666666667, ans=0.0 2023-09-28 16:10:16,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:10:17,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:10:19,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=70826.66666666667, ans=0.125 2023-09-28 16:10:23,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:23,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:10:23,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:23,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 16:10:26,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 16:10:29,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:31,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:35,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:36,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:37,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=70893.33333333333, ans=0.2 2023-09-28 16:10:37,752 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.30 vs. limit=22.5 2023-09-28 16:10:38,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:10:38,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:39,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 16:10:43,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=70960.0, ans=0.2 2023-09-28 16:10:44,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:51,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:10:51,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:53,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 16:10:56,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff3.min_abs, batch_count=70960.0, ans=0.2 2023-09-28 16:10:57,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:10:57,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:10:58,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:02,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:11:04,607 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.31 vs. limit=15.0 2023-09-28 16:11:05,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:09,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 16:11:11,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=71026.66666666667, ans=0.0 2023-09-28 16:11:12,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 16:11:12,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:12,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:15,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:11:15,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:11:17,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 16:11:19,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:21,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:24,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:11:30,093 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 16:11:31,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:11:32,932 INFO [train.py:1039] (1/4) Epoch 3, batch 50, loss[loss=0.3165, simple_loss=0.3664, pruned_loss=0.1333, over 24696.00 frames. ], tot_loss[loss=0.3085, simple_loss=0.3504, pruned_loss=0.1332, over 1048686.85 frames. ], batch size: 65, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:11:33,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:36,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:36,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 16:11:37,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:11:37,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:11:39,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:39,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:43,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:44,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=71160.0, ans=0.125 2023-09-28 16:11:45,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 16:11:45,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:51,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:11:52,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 16:11:54,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 16:11:57,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:11:59,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:11:59,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:12:01,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:03,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:12:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:12:03,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:12:03,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=71226.66666666667, ans=0.125 2023-09-28 16:12:11,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:12,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:12,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:12:14,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 16:12:16,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:12:17,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:12:17,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 16:12:17,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:19,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 16:12:27,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:12:27,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:27,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:28,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-09-28 16:12:29,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:12:31,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:33,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 16:12:33,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 16:12:35,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:36,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:37,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.49 vs. limit=15.0 2023-09-28 16:12:38,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:40,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:40,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 16:12:42,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 16:12:43,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:12:44,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:46,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:12:47,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-28 16:12:47,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 16:12:47,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 16:12:49,334 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.184e+02 2.852e+02 3.312e+02 4.404e+02 9.515e+02, threshold=6.623e+02, percent-clipped=7.0 2023-09-28 16:12:49,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:49,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:51,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:12:51,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:12:54,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:12:55,747 INFO [train.py:1039] (1/4) Epoch 3, batch 100, loss[loss=0.2603, simple_loss=0.3157, pruned_loss=0.1025, over 24574.00 frames. ], tot_loss[loss=0.3054, simple_loss=0.3484, pruned_loss=0.1312, over 1868178.34 frames. ], batch size: 60, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:12:57,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:13:01,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:03,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 16:13:03,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:13:08,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:13:08,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:08,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:13:08,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:13:08,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:10,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 16:13:15,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:13:15,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:16,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:16,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:20,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=12.0 2023-09-28 16:13:21,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 16:13:22,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:23,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:24,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:13:26,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:13:26,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=71560.0, ans=0.125 2023-09-28 16:13:30,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 16:13:30,754 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 16:13:32,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:13:32,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:13:36,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:13:38,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:40,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:45,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:45,826 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.51 vs. limit=22.5 2023-09-28 16:13:47,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 16:13:49,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:13:54,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:13:54,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:13:56,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:56,679 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:14:00,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:03,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:14:05,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:14:08,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:10,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:10,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:11,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:14:11,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:11,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 16:14:11,924 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 16:14:11,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:13,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:14:15,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:15,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:15,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 16:14:15,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:14:15,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:14:16,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:16,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:18,158 INFO [train.py:1039] (1/4) Epoch 3, batch 150, loss[loss=0.3082, simple_loss=0.3554, pruned_loss=0.1305, over 23937.00 frames. ], tot_loss[loss=0.3012, simple_loss=0.3469, pruned_loss=0.1277, over 2511207.99 frames. ], batch size: 80, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:14:18,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:19,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:14:19,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:14:23,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:26,301 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.99 vs. limit=6.0 2023-09-28 16:14:27,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:14:27,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:14:27,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:33,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:33,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:33,895 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:14:35,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=71893.33333333333, ans=0.125 2023-09-28 16:14:38,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:14:38,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:41,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 16:14:41,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 16:14:41,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 16:14:44,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:14:44,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:14:46,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:14:48,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:49,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:49,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:49,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:52,712 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 16:14:54,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:54,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=71960.0, ans=0.125 2023-09-28 16:14:59,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:06,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:15:07,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 16:15:11,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:15:11,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:11,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:13,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:15:15,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:15:16,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:15:16,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:18,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 16:15:22,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:22,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:22,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:15:23,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=72093.33333333333, ans=0.0 2023-09-28 16:15:24,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:15:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:27,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 16:15:29,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:15:31,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:15:33,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:34,798 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.108e+02 2.675e+02 3.139e+02 3.901e+02 5.670e+02, threshold=6.278e+02, percent-clipped=0.0 2023-09-28 16:15:37,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:15:37,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 16:15:37,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:38,556 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 16:15:42,124 INFO [train.py:1039] (1/4) Epoch 3, batch 200, loss[loss=0.3184, simple_loss=0.3536, pruned_loss=0.1416, over 22722.00 frames. ], tot_loss[loss=0.3022, simple_loss=0.3466, pruned_loss=0.1289, over 2994965.13 frames. ], batch size: 322, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:15:42,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:15:46,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:15:46,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:15:49,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 16:15:51,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:51,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:53,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=72160.0, ans=0.0 2023-09-28 16:15:54,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 16:15:54,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:15:56,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:57,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:02,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:16:02,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:16:02,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:07,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=72226.66666666667, ans=0.125 2023-09-28 16:16:25,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:16:26,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:16:26,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:16:26,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:16:27,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:16:27,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:16:28,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:31,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:16:31,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:31,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:16:33,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 16:16:34,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:16:34,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:37,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:16:46,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:51,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=72426.66666666667, ans=0.2 2023-09-28 16:16:51,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=72426.66666666667, ans=0.125 2023-09-28 16:16:55,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:55,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:17:00,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:03,670 INFO [train.py:1039] (1/4) Epoch 3, batch 250, loss[loss=0.2989, simple_loss=0.3335, pruned_loss=0.1322, over 23320.00 frames. ], tot_loss[loss=0.2994, simple_loss=0.3452, pruned_loss=0.1268, over 3388012.24 frames. ], batch size: 119, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:17:03,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 16:17:05,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:05,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:17:05,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:05,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=72493.33333333333, ans=0.0 2023-09-28 16:17:06,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:17:07,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 16:17:07,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:17:08,543 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 16:17:10,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:11,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:17:13,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:13,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:15,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:17:16,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:16,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=72493.33333333333, ans=0.125 2023-09-28 16:17:18,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:17:24,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:17:29,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=72560.0, ans=0.2 2023-09-28 16:17:34,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:36,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:36,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:17:42,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:17:42,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:17:44,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:17:45,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:47,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:17:47,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:17:48,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:50,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:17:50,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=72626.66666666667, ans=0.05 2023-09-28 16:17:55,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 16:17:55,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:57,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:17:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:17:57,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:17:58,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:17:58,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=72693.33333333333, ans=0.1 2023-09-28 16:18:00,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:18:01,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:18:04,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:05,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:18:06,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:09,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:18:13,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:16,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:18:17,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=72760.0, ans=0.04949747468305833 2023-09-28 16:18:19,837 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.635e+02 3.105e+02 3.716e+02 7.443e+02, threshold=6.210e+02, percent-clipped=1.0 2023-09-28 16:18:22,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:25,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:18:26,485 INFO [train.py:1039] (1/4) Epoch 3, batch 300, loss[loss=0.274, simple_loss=0.3322, pruned_loss=0.1079, over 24486.00 frames. ], tot_loss[loss=0.2962, simple_loss=0.3417, pruned_loss=0.1254, over 3676894.52 frames. ], batch size: 63, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:18:28,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 16:18:28,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=72826.66666666667, ans=0.0 2023-09-28 16:18:29,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:18:29,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:30,681 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.98 vs. limit=22.5 2023-09-28 16:18:31,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 16:18:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:18:34,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:18:34,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 16:18:37,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:40,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:18:42,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:18:42,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 16:18:43,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:45,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:18:45,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 16:18:45,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:18:50,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:18:52,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=72893.33333333333, ans=0.125 2023-09-28 16:18:54,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:18:54,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 16:18:59,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 16:19:01,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:03,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:06,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=72960.0, ans=0.2 2023-09-28 16:19:07,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:07,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 16:19:07,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:19:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:19:10,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:19:12,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:15,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:19:15,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 16:19:16,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:19:18,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:19,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=73026.66666666667, ans=0.125 2023-09-28 16:19:20,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 16:19:20,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:21,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=73026.66666666667, ans=0.0 2023-09-28 16:19:24,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:19:28,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:19:28,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 16:19:33,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:33,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:19:33,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=73093.33333333333, ans=0.0 2023-09-28 16:19:36,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:37,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:19:37,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 16:19:37,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:19:37,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:39,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 16:19:40,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=73093.33333333333, ans=0.1 2023-09-28 16:19:41,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:42,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:44,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:44,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:44,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:49,023 INFO [train.py:1039] (1/4) Epoch 3, batch 350, loss[loss=0.2532, simple_loss=0.3072, pruned_loss=0.09962, over 20395.00 frames. ], tot_loss[loss=0.295, simple_loss=0.3402, pruned_loss=0.1249, over 3907135.70 frames. ], batch size: 44, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:19:49,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:19:49,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:19:50,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:58,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:20:01,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:01,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:05,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 16:20:06,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:06,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 16:20:11,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:11,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 16:20:12,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.12 vs. limit=15.0 2023-09-28 16:20:13,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:14,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 16:20:15,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=73226.66666666667, ans=0.0 2023-09-28 16:20:16,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:20:18,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:19,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:20:21,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:21,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:22,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:20:24,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:20:24,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:31,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:20:31,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:20:32,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:20:34,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:40,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 16:20:40,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:45,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:45,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:20:45,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:47,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 16:20:51,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:20:52,725 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 16:20:52,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 16:20:52,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:55,436 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.30 vs. limit=15.0 2023-09-28 16:20:57,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:57,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 16:21:00,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:02,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:21:03,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.765e+02 3.239e+02 3.985e+02 6.243e+02, threshold=6.477e+02, percent-clipped=2.0 2023-09-28 16:21:03,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:05,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:05,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:08,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:10,296 INFO [train.py:1039] (1/4) Epoch 3, batch 400, loss[loss=0.3042, simple_loss=0.3666, pruned_loss=0.1209, over 24463.00 frames. ], tot_loss[loss=0.2954, simple_loss=0.3395, pruned_loss=0.1256, over 4054206.24 frames. ], batch size: 69, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:21:10,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:21:13,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:21:15,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 16:21:15,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:15,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:17,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:21:18,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:20,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:22,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:22,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 16:21:25,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 16:21:25,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:25,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=73493.33333333333, ans=0.125 2023-09-28 16:21:26,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 16:21:26,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:21:30,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:30,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 16:21:30,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:21:30,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:31,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:33,267 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 16:21:34,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 16:21:39,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:41,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:42,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 16:21:43,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 16:21:46,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:21:49,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:21:57,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 16:22:00,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:22:03,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 16:22:06,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:22:07,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:22:07,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 16:22:11,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:22:14,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:22:16,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:22:16,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=73760.0, ans=0.0 2023-09-28 16:22:19,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:19,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 16:22:19,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:22:20,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 16:22:23,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:22:23,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:22:24,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 16:22:26,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:22:26,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:22:27,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=73760.0, ans=0.125 2023-09-28 16:22:28,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:22:30,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 16:22:30,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:22:32,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:22:32,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:22:32,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 16:22:32,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:22:33,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:22:35,089 INFO [train.py:1039] (1/4) Epoch 3, batch 450, loss[loss=0.2844, simple_loss=0.3458, pruned_loss=0.1115, over 24374.00 frames. ], tot_loss[loss=0.2973, simple_loss=0.3414, pruned_loss=0.1266, over 4197739.29 frames. ], batch size: 77, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:22:36,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:22:46,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:46,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:22:48,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 16:22:49,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 16:22:52,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:22:56,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:59,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:05,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:06,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:09,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 16:23:09,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 16:23:11,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 16:23:11,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:12,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:13,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=73960.0, ans=22.5 2023-09-28 16:23:14,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:23:16,043 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 16:23:16,057 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 16:23:16,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:23:17,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:23:19,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:23:22,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:23:22,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:23:24,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:23:24,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 16:23:26,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:26,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=74026.66666666667, ans=0.1 2023-09-28 16:23:29,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:23:29,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:23:32,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 16:23:36,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:23:38,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 16:23:40,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 16:23:40,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=74093.33333333333, ans=0.07 2023-09-28 16:23:41,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:46,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:23:47,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:23:49,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:23:51,038 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 16:23:52,497 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.112e+02 2.606e+02 2.993e+02 3.540e+02 4.868e+02, threshold=5.986e+02, percent-clipped=0.0 2023-09-28 16:23:54,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:54,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:23:54,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:54,819 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 16:23:57,536 INFO [train.py:1039] (1/4) Epoch 3, batch 500, loss[loss=0.2765, simple_loss=0.335, pruned_loss=0.109, over 24640.00 frames. ], tot_loss[loss=0.297, simple_loss=0.3416, pruned_loss=0.1262, over 4306526.97 frames. ], batch size: 65, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:23:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 16:23:57,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:00,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:24:05,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:24:07,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:24:10,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:24:11,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:24:11,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:21,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:22,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:24:22,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:24:22,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:24,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 16:24:24,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:24:27,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:24:28,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:24:28,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:24:30,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:30,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 16:24:35,596 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 16:24:37,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:24:38,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:39,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:24:44,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 16:24:47,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:24:47,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:24:53,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:54,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:55,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=74360.0, ans=0.0 2023-09-28 16:24:59,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:04,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 16:25:04,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:04,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:10,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 16:25:10,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:25:12,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:16,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 16:25:18,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 16:25:18,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:18,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 16:25:20,332 INFO [train.py:1039] (1/4) Epoch 3, batch 550, loss[loss=0.2832, simple_loss=0.337, pruned_loss=0.1147, over 23404.00 frames. ], tot_loss[loss=0.2978, simple_loss=0.3426, pruned_loss=0.1265, over 4396989.15 frames. ], batch size: 93, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:25:20,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:25:20,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:21,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:25:25,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:25:28,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:30,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 16:25:30,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:25:34,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:39,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:25:40,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:44,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 16:25:44,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 16:25:47,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:25:52,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:25:52,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:25:54,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:25:57,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.31 vs. limit=15.0 2023-09-28 16:25:58,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:58,230 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 16:26:00,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:26:01,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:26:05,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:26:06,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:26:06,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:26:07,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=74626.66666666667, ans=0.1 2023-09-28 16:26:08,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:09,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 16:26:11,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 16:26:12,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:12,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:26:14,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:26:14,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:26:17,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:26:18,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:26:22,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:26:22,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:24,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 16:26:24,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:26:27,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:28,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:26:30,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:32,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:26:32,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:26:36,868 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.890e+02 2.622e+02 3.187e+02 4.101e+02 6.995e+02, threshold=6.373e+02, percent-clipped=4.0 2023-09-28 16:26:38,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 16:26:41,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 16:26:42,524 INFO [train.py:1039] (1/4) Epoch 3, batch 600, loss[loss=0.2803, simple_loss=0.3297, pruned_loss=0.1155, over 19803.00 frames. ], tot_loss[loss=0.2965, simple_loss=0.342, pruned_loss=0.1255, over 4477536.97 frames. ], batch size: 43, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:26:42,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:26:44,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:26:44,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:50,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:26:51,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:26:53,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 16:26:56,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:26:58,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:26:58,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:02,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 16:27:03,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:27:06,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=74893.33333333333, ans=6.0 2023-09-28 16:27:11,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 16:27:11,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=74893.33333333333, ans=0.125 2023-09-28 16:27:14,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:27:14,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:14,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:27:19,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:27:21,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:27:21,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:27,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:27:33,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:33,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:27:33,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:42,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 16:27:44,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.21 vs. limit=22.5 2023-09-28 16:27:48,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:27:48,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:27:53,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 16:27:53,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:27:56,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 16:27:56,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:27:58,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:28:03,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 16:28:04,673 INFO [train.py:1039] (1/4) Epoch 3, batch 650, loss[loss=0.2891, simple_loss=0.3412, pruned_loss=0.1185, over 24503.00 frames. ], tot_loss[loss=0.2959, simple_loss=0.3413, pruned_loss=0.1252, over 4522890.18 frames. ], batch size: 66, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:28:04,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:28:06,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:09,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:28:12,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:13,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 16:28:15,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:28:15,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=75160.0, ans=0.2 2023-09-28 16:28:20,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:28:20,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:24,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:26,533 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.97 vs. limit=22.5 2023-09-28 16:28:27,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 16:28:30,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:28:32,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:35,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:28:35,766 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:28:36,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:28:37,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=75293.33333333333, ans=0.125 2023-09-28 16:28:38,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:39,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:39,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:28:41,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:43,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:28:45,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:28:45,339 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 16:28:45,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:45,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:28:50,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:51,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:53,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:28:53,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:28:55,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 16:28:56,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:28:56,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:58,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:28:58,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:59,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:29:01,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 16:29:03,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 16:29:03,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:03,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:29:03,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:29:04,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:29:04,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:29:08,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=75360.0, ans=0.125 2023-09-28 16:29:11,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:11,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:12,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:29:15,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:15,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:29:15,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:23,200 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.199e+02 2.685e+02 3.190e+02 3.569e+02 4.758e+02, threshold=6.380e+02, percent-clipped=0.0 2023-09-28 16:29:23,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:29:23,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:24,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:24,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:27,704 INFO [train.py:1039] (1/4) Epoch 3, batch 700, loss[loss=0.2697, simple_loss=0.327, pruned_loss=0.1062, over 24451.00 frames. ], tot_loss[loss=0.2949, simple_loss=0.34, pruned_loss=0.1249, over 4559896.87 frames. ], batch size: 63, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:29:30,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 16:29:31,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 16:29:33,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=75493.33333333333, ans=0.125 2023-09-28 16:29:34,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 16:29:34,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:36,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:29:39,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 16:29:44,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:45,325 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=15.0 2023-09-28 16:29:46,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:29:47,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:48,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=75560.0, ans=0.2 2023-09-28 16:29:49,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:29:50,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:53,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=75560.0, ans=0.0 2023-09-28 16:29:54,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:56,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:29:57,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:29:59,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 16:30:03,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 16:30:06,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:30:06,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:30:08,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:30:11,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:30:11,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 16:30:16,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:17,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:30:17,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 16:30:19,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=75693.33333333333, ans=0.2 2023-09-28 16:30:21,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=75693.33333333333, ans=0.2 2023-09-28 16:30:22,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:30:24,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:27,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:30:28,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=75693.33333333333, ans=0.125 2023-09-28 16:30:32,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:30:34,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 16:30:36,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 16:30:36,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 16:30:40,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:42,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:30:44,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:30:44,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:44,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 16:30:50,479 INFO [train.py:1039] (1/4) Epoch 3, batch 750, loss[loss=0.3008, simple_loss=0.3517, pruned_loss=0.1249, over 24468.00 frames. ], tot_loss[loss=0.2939, simple_loss=0.339, pruned_loss=0.1244, over 4593689.35 frames. ], batch size: 69, lr: 3.11e-02, grad_scale: 16.0 2023-09-28 16:30:50,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 16:30:50,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 16:30:50,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 16:30:52,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 16:30:52,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 16:30:53,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:30:55,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 16:30:56,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:58,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:31:00,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:01,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:01,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:31:02,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:05,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:31:06,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:31:08,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:31:12,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:12,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:14,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 16:31:15,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:31:15,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:16,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=75893.33333333333, ans=0.125 2023-09-28 16:31:19,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:19,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=75893.33333333333, ans=0.0 2023-09-28 16:31:20,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:31:22,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 16:31:22,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:31:22,533 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:31:25,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=75960.0, ans=0.0 2023-09-28 16:31:26,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 16:31:26,670 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 16:31:28,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 16:31:28,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:31:28,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:31:29,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=75960.0, ans=0.1 2023-09-28 16:31:31,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:31:38,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:31:38,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:38,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:31:41,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:42,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:44,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 16:31:44,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:31:47,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 16:31:49,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:31:52,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:31:53,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 16:31:53,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:56,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:31:57,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=76093.33333333333, ans=0.2 2023-09-28 16:31:58,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:31:59,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:01,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:32:03,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=76093.33333333333, ans=0.125 2023-09-28 16:32:04,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 16:32:04,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:06,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,687 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.054e+02 2.659e+02 2.971e+02 3.538e+02 5.180e+02, threshold=5.942e+02, percent-clipped=0.0 2023-09-28 16:32:07,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:10,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:10,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:32:12,844 INFO [train.py:1039] (1/4) Epoch 3, batch 800, loss[loss=0.3122, simple_loss=0.3715, pruned_loss=0.1265, over 24316.00 frames. ], tot_loss[loss=0.2934, simple_loss=0.3395, pruned_loss=0.1236, over 4632647.82 frames. ], batch size: 74, lr: 3.11e-02, grad_scale: 32.0 2023-09-28 16:32:14,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=76160.0, ans=0.1 2023-09-28 16:32:23,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:23,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:25,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:26,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.84 vs. limit=15.0 2023-09-28 16:32:26,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:26,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:28,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:31,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.20 vs. limit=5.0 2023-09-28 16:32:31,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:33,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:32:36,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 16:32:37,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:39,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:39,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:32:39,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:39,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 16:32:41,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:41,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 16:32:45,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:48,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:50,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:50,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:55,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:56,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:00,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:00,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:33:00,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 16:33:02,941 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 16:33:02,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 16:33:04,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:33:04,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:06,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:06,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:11,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=76360.0, ans=0.125 2023-09-28 16:33:12,650 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 16:33:12,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 16:33:14,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:33:15,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:33:20,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:33:23,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:25,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 16:33:25,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:33:30,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 16:33:33,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:35,165 INFO [train.py:1039] (1/4) Epoch 3, batch 850, loss[loss=0.2677, simple_loss=0.3257, pruned_loss=0.1049, over 24453.00 frames. ], tot_loss[loss=0.2931, simple_loss=0.3399, pruned_loss=0.1231, over 4665171.65 frames. ], batch size: 63, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:33:37,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:33:38,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 16:33:38,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:33:40,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:42,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 16:33:43,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:43,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:33:45,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:46,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:33:48,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:50,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 16:33:50,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 16:33:50,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 16:33:50,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=76560.0, ans=0.0 2023-09-28 16:33:51,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:53,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:54,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:54,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:54,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:33:58,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:58,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:00,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 16:34:04,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 16:34:05,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:34:09,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 16:34:12,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 16:34:13,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 16:34:15,985 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 16:34:16,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:16,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:34:16,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:34:19,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 16:34:23,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:25,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:25,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:34:25,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:34:28,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:34:29,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:34:29,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 16:34:35,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:34:35,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:35,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:34:35,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:34:37,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:43,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:34:44,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:34:44,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=76760.0, ans=0.0 2023-09-28 16:34:46,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:34:47,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:34:53,428 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.514e+02 2.970e+02 3.562e+02 5.095e+02, threshold=5.941e+02, percent-clipped=0.0 2023-09-28 16:34:55,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:34:57,901 INFO [train.py:1039] (1/4) Epoch 3, batch 900, loss[loss=0.2927, simple_loss=0.3491, pruned_loss=0.1181, over 24028.00 frames. ], tot_loss[loss=0.2955, simple_loss=0.3415, pruned_loss=0.1248, over 4667917.33 frames. ], batch size: 80, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:34:57,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:58,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 16:34:58,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:34:59,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:35:01,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 16:35:07,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:35:10,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:12,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 16:35:17,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:35:17,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 16:35:18,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:35:18,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=11.05 vs. limit=10.0 2023-09-28 16:35:19,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:35:19,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:19,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:35:19,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:35:31,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:35:31,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:31,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:35:34,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:39,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 16:35:41,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:35:41,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=76960.0, ans=0.125 2023-09-28 16:35:46,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:35:46,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:35:48,185 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 16:35:48,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 16:35:55,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:35:55,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:35:55,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:36:00,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=77026.66666666667, ans=0.0 2023-09-28 16:36:01,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:01,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:03,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 16:36:03,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:36:06,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=77093.33333333333, ans=0.1 2023-09-28 16:36:07,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 16:36:09,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:36:09,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:11,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:36:11,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:16,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 16:36:16,453 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 16:36:19,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:36:19,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 16:36:22,417 INFO [train.py:1039] (1/4) Epoch 3, batch 950, loss[loss=0.267, simple_loss=0.3268, pruned_loss=0.1036, over 24641.00 frames. ], tot_loss[loss=0.2951, simple_loss=0.3407, pruned_loss=0.1247, over 4678547.69 frames. ], batch size: 65, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:36:22,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:27,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 16:36:27,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=77160.0, ans=0.0 2023-09-28 16:36:31,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:32,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:33,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:35,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:36:36,635 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 16:36:39,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:41,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:36:42,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:42,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:36:42,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 16:36:43,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=77226.66666666667, ans=0.125 2023-09-28 16:36:44,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:36:46,264 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.93 vs. limit=15.0 2023-09-28 16:36:47,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:47,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 16:36:49,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:52,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 16:36:57,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:36:59,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=77293.33333333333, ans=0.1 2023-09-28 16:37:00,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:37:03,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:37:07,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:07,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:37:10,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 16:37:12,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=77360.0, ans=0.0 2023-09-28 16:37:15,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:37:15,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:37:15,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:15,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:15,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:37:21,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 16:37:21,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:37:25,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:27,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:27,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 16:37:27,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:27,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:37:27,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 16:37:33,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:37:35,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:40,047 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.741e+02 3.253e+02 3.972e+02 7.741e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 16:37:40,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:37:43,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 16:37:43,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 16:37:45,288 INFO [train.py:1039] (1/4) Epoch 3, batch 1000, loss[loss=0.2951, simple_loss=0.3293, pruned_loss=0.1304, over 23576.00 frames. ], tot_loss[loss=0.2942, simple_loss=0.34, pruned_loss=0.1242, over 4697824.87 frames. ], batch size: 149, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:37:47,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:50,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 16:37:50,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:50,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=77493.33333333333, ans=0.125 2023-09-28 16:37:53,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:37:56,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 16:37:56,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 16:38:01,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:02,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:38:02,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:07,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 16:38:12,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 16:38:13,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 16:38:13,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:15,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 16:38:18,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 16:38:18,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 16:38:20,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:20,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:28,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:29,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:38:29,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:29,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:29,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 16:38:29,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:31,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:38:33,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:33,609 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 16:38:36,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 16:38:38,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 16:38:39,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 16:38:43,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:38:50,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:50,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:38:50,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:51,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:38:53,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 16:38:54,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:38:55,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 16:38:55,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 16:38:57,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:38:57,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:39:00,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:39:02,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:39:04,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:08,074 INFO [train.py:1039] (1/4) Epoch 3, batch 1050, loss[loss=0.3092, simple_loss=0.345, pruned_loss=0.1367, over 23817.00 frames. ], tot_loss[loss=0.2929, simple_loss=0.3384, pruned_loss=0.1237, over 4690764.52 frames. ], batch size: 179, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:39:09,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:39:09,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:39:11,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:39:13,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:14,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:16,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:39:19,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:39:22,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:39:22,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:39:22,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:39:24,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:39:25,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 16:39:26,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:26,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 16:39:29,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:39:29,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 16:39:29,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:39:32,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=77893.33333333333, ans=0.1 2023-09-28 16:39:35,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:37,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:39:37,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:40,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 16:39:40,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 16:39:40,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:45,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 16:39:47,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 16:39:48,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:52,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:39:53,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:39:53,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:39:55,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:39:58,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:40:03,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 16:40:03,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 16:40:03,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 16:40:05,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:05,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:40:08,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 16:40:14,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:40:15,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:15,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:15,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:15,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 16:40:20,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:20,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 16:40:22,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 16:40:22,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:40:25,623 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.910e+02 2.729e+02 3.108e+02 3.500e+02 5.269e+02, threshold=6.215e+02, percent-clipped=0.0 2023-09-28 16:40:25,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:40:31,241 INFO [train.py:1039] (1/4) Epoch 3, batch 1100, loss[loss=0.3352, simple_loss=0.3832, pruned_loss=0.1436, over 24543.00 frames. ], tot_loss[loss=0.2925, simple_loss=0.3385, pruned_loss=0.1233, over 4708800.16 frames. ], batch size: 71, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:40:31,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:40:34,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=78160.0, ans=0.1 2023-09-28 16:40:37,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:40:38,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:40:38,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:40,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 16:40:40,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:40:45,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:40:49,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:40:54,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:40:54,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 16:40:55,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:40:57,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:57,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:59,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:41:01,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:41:04,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=78293.33333333333, ans=0.125 2023-09-28 16:41:05,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:41:09,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 16:41:09,311 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 16:41:10,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:12,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:13,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:41:13,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:41:16,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.15 vs. limit=22.5 2023-09-28 16:41:16,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 16:41:16,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:41:16,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:41:17,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:41:17,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:19,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 16:41:24,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:41:24,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 16:41:27,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:41:32,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:41:35,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 16:41:36,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:41:38,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:40,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:41:40,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:43,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 16:41:43,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:41:43,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:45,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 16:41:45,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:41:45,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 16:41:47,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:41:47,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:41:47,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=78426.66666666667, ans=0.125 2023-09-28 16:41:49,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:41:53,549 INFO [train.py:1039] (1/4) Epoch 3, batch 1150, loss[loss=0.3833, simple_loss=0.3913, pruned_loss=0.1877, over 19570.00 frames. ], tot_loss[loss=0.293, simple_loss=0.3393, pruned_loss=0.1234, over 4711725.14 frames. ], batch size: 388, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:41:55,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:41:58,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:42:00,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:00,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:42:01,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 16:42:01,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:04,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 16:42:04,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:05,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:42:05,609 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.42 vs. limit=15.0 2023-09-28 16:42:10,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.04 vs. limit=15.0 2023-09-28 16:42:13,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 16:42:16,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:19,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:21,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:21,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 16:42:21,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:42:21,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:24,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 16:42:26,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:28,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:28,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=78626.66666666667, ans=0.0 2023-09-28 16:42:30,371 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:42:38,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:41,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=78693.33333333333, ans=0.125 2023-09-28 16:42:46,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:46,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 16:42:46,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=78693.33333333333, ans=0.1 2023-09-28 16:42:48,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:48,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:54,751 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 16:42:56,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:04,597 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 16:43:07,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:09,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:43:09,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:43:11,194 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.090e+02 2.632e+02 2.933e+02 3.650e+02 8.073e+02, threshold=5.867e+02, percent-clipped=1.0 2023-09-28 16:43:11,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:43:14,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:14,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=78826.66666666667, ans=0.0 2023-09-28 16:43:15,753 INFO [train.py:1039] (1/4) Epoch 3, batch 1200, loss[loss=0.268, simple_loss=0.3331, pruned_loss=0.1015, over 24624.00 frames. ], tot_loss[loss=0.2917, simple_loss=0.3391, pruned_loss=0.1221, over 4730966.55 frames. ], batch size: 68, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:43:21,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:43:21,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:43:22,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:22,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:22,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:43:23,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=78826.66666666667, ans=0.2 2023-09-28 16:43:24,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=78826.66666666667, ans=0.125 2023-09-28 16:43:25,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:43:27,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:43:29,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:29,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:32,569 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 16:43:35,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 16:43:39,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:43:42,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:43:43,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=78893.33333333333, ans=0.1 2023-09-28 16:43:45,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:45,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:43:45,394 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 16:43:46,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:55,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:43:55,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:43:55,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 16:43:57,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:43:57,895 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.62 vs. limit=15.0 2023-09-28 16:44:02,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 16:44:05,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 16:44:05,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:44:07,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:44:08,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:08,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:44:11,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:44:11,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:44:12,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:44:12,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=79026.66666666667, ans=0.0 2023-09-28 16:44:13,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 16:44:13,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:44:13,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:14,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:44:17,527 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.26 vs. limit=15.0 2023-09-28 16:44:18,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:18,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:23,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:44:25,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:44:26,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 16:44:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 16:44:32,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:44:35,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:36,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:44:37,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=79160.0, ans=0.07 2023-09-28 16:44:38,731 INFO [train.py:1039] (1/4) Epoch 3, batch 1250, loss[loss=0.308, simple_loss=0.3402, pruned_loss=0.1379, over 23641.00 frames. ], tot_loss[loss=0.2928, simple_loss=0.3401, pruned_loss=0.1227, over 4726104.17 frames. ], batch size: 232, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:44:38,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:40,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 16:44:45,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:44:47,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:47,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 16:44:50,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:44:50,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:44:55,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:44:55,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:57,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:44:57,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:02,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:45:07,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:45:07,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:45:07,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:08,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:08,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:12,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:13,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:45:20,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 16:45:20,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:45:21,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=79293.33333333333, ans=0.1 2023-09-28 16:45:25,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:26,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 16:45:26,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:45:26,825 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 16:45:26,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:26,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:30,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:35,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:35,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:45:37,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 16:45:37,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 16:45:37,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 16:45:41,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:45:43,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 16:45:43,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:43,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=79426.66666666667, ans=0.125 2023-09-28 16:45:47,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:45:47,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:45:50,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 16:45:51,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:45:52,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:45:52,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:45:53,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:55,647 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.589e+02 2.905e+02 3.561e+02 6.488e+02, threshold=5.810e+02, percent-clipped=2.0 2023-09-28 16:45:55,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 16:45:58,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:58,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:45:59,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:46:01,403 INFO [train.py:1039] (1/4) Epoch 3, batch 1300, loss[loss=0.3092, simple_loss=0.3421, pruned_loss=0.1381, over 23794.00 frames. ], tot_loss[loss=0.2924, simple_loss=0.3395, pruned_loss=0.1226, over 4728086.62 frames. ], batch size: 195, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:46:03,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:46:05,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:46:06,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 16:46:12,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:14,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:46:14,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:17,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:46:18,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:46:18,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 16:46:24,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:46:24,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:46:25,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 16:46:30,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:46:32,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:34,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:34,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.55 vs. limit=15.0 2023-09-28 16:46:35,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:35,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:37,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:46:37,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:46:38,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 16:46:39,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=79626.66666666667, ans=0.125 2023-09-28 16:46:45,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:46:45,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:46:46,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-09-28 16:46:47,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 16:46:47,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:46:50,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:46:53,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:53,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 16:46:54,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:54,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 16:46:56,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:59,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:59,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:47:03,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 16:47:04,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 16:47:04,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 16:47:09,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:47:11,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 16:47:15,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:22,588 INFO [train.py:1039] (1/4) Epoch 3, batch 1350, loss[loss=0.2759, simple_loss=0.3214, pruned_loss=0.1152, over 19077.00 frames. ], tot_loss[loss=0.2922, simple_loss=0.3387, pruned_loss=0.1228, over 4725287.18 frames. ], batch size: 41, lr: 3.05e-02, grad_scale: 32.0 2023-09-28 16:47:22,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 16:47:28,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:29,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:32,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:32,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:33,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=79826.66666666667, ans=0.2 2023-09-28 16:47:36,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:47:36,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:37,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=79893.33333333333, ans=0.95 2023-09-28 16:47:40,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:40,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 16:47:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:47:45,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:47:49,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 16:47:50,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:47:51,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:47:51,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 16:47:52,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 16:47:55,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 16:47:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:57,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 16:48:02,572 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.99 vs. limit=10.0 2023-09-28 16:48:11,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:20,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:21,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=80026.66666666667, ans=0.125 2023-09-28 16:48:22,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:22,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 16:48:25,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.97 vs. limit=6.0 2023-09-28 16:48:25,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:27,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 16:48:27,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:48:28,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:48:30,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:48:33,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 16:48:35,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:48:41,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 16:48:42,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 16:48:44,354 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.017e+02 2.667e+02 3.027e+02 3.668e+02 6.120e+02, threshold=6.055e+02, percent-clipped=2.0 2023-09-28 16:48:45,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=80093.33333333333, ans=0.125 2023-09-28 16:48:47,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=80160.0, ans=0.125 2023-09-28 16:48:48,465 INFO [train.py:1039] (1/4) Epoch 3, batch 1400, loss[loss=0.3061, simple_loss=0.3368, pruned_loss=0.1377, over 23906.00 frames. ], tot_loss[loss=0.2917, simple_loss=0.3373, pruned_loss=0.1231, over 4711875.20 frames. ], batch size: 195, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:48:48,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 16:48:50,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:51,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=80160.0, ans=0.125 2023-09-28 16:48:52,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=80160.0, ans=0.0 2023-09-28 16:48:55,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:48:56,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:49:00,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 16:49:02,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 16:49:04,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=80226.66666666667, ans=0.04949747468305833 2023-09-28 16:49:06,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=80226.66666666667, ans=0.125 2023-09-28 16:49:07,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=80226.66666666667, ans=0.125 2023-09-28 16:49:11,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:49:12,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:14,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:49:14,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:49:18,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:49:19,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:49:30,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:30,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:34,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 16:49:34,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:49:34,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:49:36,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:49:37,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:39,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:49:39,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:49:39,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:49:40,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 16:49:40,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:49:45,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:52,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:49:53,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=80426.66666666667, ans=0.2 2023-09-28 16:50:00,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 16:50:02,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:50:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:50:06,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:50:07,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:07,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:50:10,875 INFO [train.py:1039] (1/4) Epoch 3, batch 1450, loss[loss=0.2656, simple_loss=0.3174, pruned_loss=0.1069, over 24336.00 frames. ], tot_loss[loss=0.29, simple_loss=0.3367, pruned_loss=0.1217, over 4729216.18 frames. ], batch size: 61, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:50:12,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:50:12,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:50:12,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:14,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:50:18,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:19,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=80493.33333333333, ans=0.125 2023-09-28 16:50:19,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:50:20,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:50:20,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 16:50:22,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:50:23,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 16:50:23,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:26,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:26,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 16:50:27,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:27,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:50:28,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 16:50:28,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:28,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:50:31,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:33,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.09 vs. limit=15.0 2023-09-28 16:50:33,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:37,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=80560.0, ans=0.0 2023-09-28 16:50:38,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:50:38,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:50:40,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:40,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:43,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:43,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:50:43,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:43,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=80626.66666666667, ans=0.015 2023-09-28 16:50:45,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:50:48,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 16:50:51,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:53,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=80626.66666666667, ans=0.125 2023-09-28 16:50:54,289 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 16:50:54,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=80626.66666666667, ans=0.0 2023-09-28 16:50:56,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:50:56,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:50:57,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:01,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 16:51:05,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:07,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 16:51:09,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 16:51:11,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:13,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=80693.33333333333, ans=0.025 2023-09-28 16:51:14,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:14,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:51:15,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 16:51:18,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 16:51:18,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 16:51:19,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:20,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=80760.0, ans=0.125 2023-09-28 16:51:21,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:51:26,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=80760.0, ans=0.125 2023-09-28 16:51:28,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 2.628e+02 3.276e+02 3.890e+02 6.376e+02, threshold=6.552e+02, percent-clipped=1.0 2023-09-28 16:51:31,137 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:51:31,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=80826.66666666667, ans=0.0 2023-09-28 16:51:32,271 INFO [train.py:1039] (1/4) Epoch 3, batch 1500, loss[loss=0.2782, simple_loss=0.3365, pruned_loss=0.11, over 24566.00 frames. ], tot_loss[loss=0.2903, simple_loss=0.337, pruned_loss=0.1218, over 4705152.60 frames. ], batch size: 71, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:51:35,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 16:51:35,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:51:35,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:51:35,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:37,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:39,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:51:39,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 16:51:43,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:51:43,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:51:43,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:44,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:46,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:51:46,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 16:51:53,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:51:53,989 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.82 vs. limit=22.5 2023-09-28 16:51:54,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:51:54,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:58,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 16:51:59,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=80893.33333333333, ans=0.0 2023-09-28 16:52:02,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 16:52:04,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:52:05,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 16:52:07,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:52:12,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:12,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:52:13,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.69 vs. limit=15.0 2023-09-28 16:52:13,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:52:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 16:52:15,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:52:15,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:15,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 16:52:16,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:19,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=80960.0, ans=0.2 2023-09-28 16:52:22,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:52:22,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 16:52:28,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:52:31,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:52:35,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=81026.66666666667, ans=0.2 2023-09-28 16:52:36,343 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 16:52:37,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:37,837 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 16:52:39,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:52:42,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:52:44,324 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 16:52:44,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:52:47,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 16:52:49,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:52,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:54,272 INFO [train.py:1039] (1/4) Epoch 3, batch 1550, loss[loss=0.2534, simple_loss=0.3114, pruned_loss=0.09774, over 24655.00 frames. ], tot_loss[loss=0.2895, simple_loss=0.3367, pruned_loss=0.1211, over 4722800.16 frames. ], batch size: 65, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:52:54,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:54,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:57,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 16:52:57,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 16:52:57,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:52:59,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 16:53:01,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 16:53:03,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:06,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:06,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:06,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:53:07,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:09,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:12,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 16:53:12,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:12,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:53:13,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:53:16,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:53:16,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 16:53:16,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=81226.66666666667, ans=0.125 2023-09-28 16:53:18,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:18,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 16:53:18,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=81226.66666666667, ans=0.1 2023-09-28 16:53:20,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 16:53:20,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 16:53:21,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:23,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:53:29,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 16:53:29,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 16:53:38,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:42,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:42,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:53:42,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:53:44,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 16:53:48,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:53:48,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=81360.0, ans=0.2 2023-09-28 16:53:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:53,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:53:53,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=81360.0, ans=0.1 2023-09-28 16:53:55,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:53:55,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:55,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 16:53:55,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:53:57,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=81360.0, ans=0.1 2023-09-28 16:53:59,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=81426.66666666667, ans=0.2 2023-09-28 16:54:00,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:54:00,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:00,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 16:54:00,560 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 16:54:00,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=81426.66666666667, ans=0.125 2023-09-28 16:54:03,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:08,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 16:54:13,775 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.706e+02 3.074e+02 3.869e+02 6.821e+02, threshold=6.147e+02, percent-clipped=1.0 2023-09-28 16:54:14,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:15,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:16,863 INFO [train.py:1039] (1/4) Epoch 3, batch 1600, loss[loss=0.3136, simple_loss=0.3429, pruned_loss=0.1422, over 23709.00 frames. ], tot_loss[loss=0.2899, simple_loss=0.3369, pruned_loss=0.1215, over 4723975.53 frames. ], batch size: 179, lr: 3.03e-02, grad_scale: 32.0 2023-09-28 16:54:16,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 16:54:18,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:54:19,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:20,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:54:20,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:54:21,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:54:23,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=81493.33333333333, ans=0.125 2023-09-28 16:54:24,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:24,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 16:54:26,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 16:54:28,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 16:54:31,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:54:32,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 16:54:33,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:54:35,686 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.63 vs. limit=12.0 2023-09-28 16:54:36,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:54:40,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=81560.0, ans=0.125 2023-09-28 16:54:41,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:54:41,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=81560.0, ans=0.0 2023-09-28 16:54:44,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 16:54:47,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:54:48,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 16:54:49,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:49,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 16:54:54,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 16:55:03,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:03,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 16:55:05,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:55:08,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 16:55:11,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 16:55:13,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:55:13,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:14,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:14,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:55:18,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:55:18,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:55:21,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:55:27,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:29,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:55:29,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=81760.0, ans=0.125 2023-09-28 16:55:30,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 16:55:30,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:55:33,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 16:55:37,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:39,154 INFO [train.py:1039] (1/4) Epoch 3, batch 1650, loss[loss=0.3004, simple_loss=0.3595, pruned_loss=0.1206, over 24014.00 frames. ], tot_loss[loss=0.2905, simple_loss=0.3377, pruned_loss=0.1216, over 4720560.94 frames. ], batch size: 80, lr: 3.03e-02, grad_scale: 16.0 2023-09-28 16:55:40,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:55:43,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:55:43,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 16:55:43,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 16:55:43,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 16:55:43,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 16:55:46,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:48,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:48,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:55:48,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:55:51,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:53,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 16:55:55,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:55,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:55,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:55:55,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:55:56,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 16:55:58,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 16:56:02,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:56:03,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=81893.33333333333, ans=0.125 2023-09-28 16:56:04,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:56:08,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=81893.33333333333, ans=0.09899494936611666 2023-09-28 16:56:14,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 16:56:16,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:17,272 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.44 vs. limit=12.0 2023-09-28 16:56:17,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 16:56:21,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:25,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:56:25,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:56:25,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:26,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:56:26,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:29,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:56:30,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:31,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:31,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:32,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:33,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:56:37,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:39,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 16:56:41,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:42,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 16:56:42,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 16:56:42,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 16:56:42,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:44,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:56:45,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:48,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:48,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 16:56:51,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:52,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:56:53,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:56,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 16:56:59,515 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.428e+02 2.816e+02 3.293e+02 5.315e+02, threshold=5.632e+02, percent-clipped=0.0 2023-09-28 16:57:01,803 INFO [train.py:1039] (1/4) Epoch 3, batch 1700, loss[loss=0.262, simple_loss=0.323, pruned_loss=0.1005, over 24310.00 frames. ], tot_loss[loss=0.2899, simple_loss=0.3369, pruned_loss=0.1214, over 4707776.65 frames. ], batch size: 61, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:57:01,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:57:01,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:57:01,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 16:57:02,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:02,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:57:02,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:02,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=82160.0, ans=0.125 2023-09-28 16:57:02,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=82160.0, ans=0.125 2023-09-28 16:57:03,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=82160.0, ans=0.0 2023-09-28 16:57:06,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:57:06,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:57:06,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 16:57:08,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:57:11,828 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:57:16,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:18,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=15.0 2023-09-28 16:57:20,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:57:27,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:57:27,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:57:27,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:28,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:57:32,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 16:57:32,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=82293.33333333333, ans=0.2 2023-09-28 16:57:34,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:57:35,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:37,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:57:37,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:57:39,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 16:57:40,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 16:57:43,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:45,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 16:57:45,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:57:50,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=82360.0, ans=0.125 2023-09-28 16:57:53,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:57:55,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=82360.0, ans=0.0 2023-09-28 16:57:57,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:57:57,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:58:00,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:58:00,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 16:58:00,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:58:01,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:01,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 16:58:02,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=82360.0, ans=0.1 2023-09-28 16:58:03,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:03,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:03,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:03,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:06,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:06,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:58:07,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:07,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=82426.66666666667, ans=0.0 2023-09-28 16:58:07,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=82426.66666666667, ans=0.0 2023-09-28 16:58:09,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:58:11,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:15,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:15,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 16:58:18,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:20,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:21,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 16:58:23,944 INFO [train.py:1039] (1/4) Epoch 3, batch 1750, loss[loss=0.2819, simple_loss=0.3411, pruned_loss=0.1114, over 23954.00 frames. ], tot_loss[loss=0.2884, simple_loss=0.3353, pruned_loss=0.1208, over 4706761.13 frames. ], batch size: 80, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:58:25,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=82493.33333333333, ans=0.125 2023-09-28 16:58:28,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:28,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=82493.33333333333, ans=0.09899494936611666 2023-09-28 16:58:32,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:32,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:58:32,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 16:58:32,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:35,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:58:37,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:42,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 16:58:43,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:45,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 16:58:45,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:48,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:58:51,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 16:58:53,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 16:58:55,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:56,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 16:59:05,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:59:07,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:07,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:10,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:11,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:13,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:59:15,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:17,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:17,704 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:59:18,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:59:20,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 16:59:23,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:25,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 16:59:26,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:27,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=82693.33333333333, ans=0.05 2023-09-28 16:59:28,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:28,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:59:32,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:59:33,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:59:35,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:36,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:41,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:43,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:59:44,782 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.581e+02 2.939e+02 3.799e+02 7.676e+02, threshold=5.877e+02, percent-clipped=7.0 2023-09-28 16:59:46,362 INFO [train.py:1039] (1/4) Epoch 3, batch 1800, loss[loss=0.3195, simple_loss=0.3744, pruned_loss=0.1323, over 24365.00 frames. ], tot_loss[loss=0.288, simple_loss=0.3352, pruned_loss=0.1204, over 4707137.17 frames. ], batch size: 77, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 16:59:46,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:59:46,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 16:59:46,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:48,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:59:48,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:59:48,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:59:48,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:59:49,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:59:51,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:59:53,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:55,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:59:57,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:57,323 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:00:00,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:00:01,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:05,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:08,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:00:11,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:00:12,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 17:00:12,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:16,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:21,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 17:00:23,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 17:00:23,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 17:00:23,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:24,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:24,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:00:26,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:00:32,910 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 17:00:34,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:00:34,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=83026.66666666667, ans=0.2 2023-09-28 17:00:37,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 17:00:39,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 17:00:39,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:00:39,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:00:41,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:00:43,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=83026.66666666667, ans=0.125 2023-09-28 17:00:45,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=83026.66666666667, ans=0.125 2023-09-28 17:00:46,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 17:00:51,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:00:52,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 17:00:52,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:00:52,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:52,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:00:54,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 17:00:58,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:00:58,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:59,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 17:00:59,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:02,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:02,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:01:02,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:01:05,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:01:05,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:09,512 INFO [train.py:1039] (1/4) Epoch 3, batch 1850, loss[loss=0.2881, simple_loss=0.3276, pruned_loss=0.1243, over 23838.00 frames. ], tot_loss[loss=0.2889, simple_loss=0.336, pruned_loss=0.1209, over 4708633.48 frames. ], batch size: 212, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:01:09,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:01:11,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:17,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:01:17,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 17:01:19,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=83160.0, ans=0.125 2023-09-28 17:01:22,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 17:01:23,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.44 vs. limit=15.0 2023-09-28 17:01:25,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 17:01:28,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:01:28,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 17:01:28,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:01:37,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=83226.66666666667, ans=0.125 2023-09-28 17:01:40,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:01:42,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 17:01:44,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=83293.33333333333, ans=0.0 2023-09-28 17:01:45,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:01:45,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:01:46,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=83293.33333333333, ans=0.2 2023-09-28 17:01:49,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 17:01:49,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:49,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:01:51,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:01:51,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=83293.33333333333, ans=0.1 2023-09-28 17:01:53,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:56,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:00,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:02:00,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:00,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:02:00,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:00,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=83360.0, ans=0.125 2023-09-28 17:02:02,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:04,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:02:08,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 17:02:08,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:12,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:02:14,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:02:14,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 17:02:14,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 17:02:17,670 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 17:02:19,223 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 17:02:20,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:02:20,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:02:20,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:22,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:24,559 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 17:02:24,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:02:24,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:26,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:02:27,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:02:28,183 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.33 vs. limit=15.0 2023-09-28 17:02:30,307 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.984e+02 2.645e+02 2.967e+02 3.523e+02 5.465e+02, threshold=5.934e+02, percent-clipped=0.0 2023-09-28 17:02:30,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:02:30,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 17:02:31,900 INFO [train.py:1039] (1/4) Epoch 3, batch 1900, loss[loss=0.2468, simple_loss=0.3024, pruned_loss=0.09558, over 24457.00 frames. ], tot_loss[loss=0.2897, simple_loss=0.3373, pruned_loss=0.1211, over 4724561.15 frames. ], batch size: 58, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:02:32,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:32,189 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 17:02:32,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:02:33,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:39,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:41,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:02:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 17:02:43,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 17:02:47,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:47,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:47,465 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 17:02:48,870 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 17:02:51,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 17:02:52,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:02:55,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 17:02:59,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 17:03:03,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.21 vs. limit=15.0 2023-09-28 17:03:10,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 17:03:13,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 17:03:13,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:03:13,296 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 17:03:13,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 17:03:13,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 17:03:14,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 17:03:14,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:03:19,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 17:03:22,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:03:26,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:26,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 17:03:26,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:03:31,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 17:03:33,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:40,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:03:41,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:03:41,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:03:41,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=83760.0, ans=0.125 2023-09-28 17:03:43,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:03:44,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:03:44,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:03:44,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=83760.0, ans=0.07 2023-09-28 17:03:46,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:03:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:03:47,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:03:51,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:03:51,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:51,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=83760.0, ans=0.1 2023-09-28 17:03:52,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:54,196 INFO [train.py:1039] (1/4) Epoch 3, batch 1950, loss[loss=0.294, simple_loss=0.3311, pruned_loss=0.1284, over 23749.00 frames. ], tot_loss[loss=0.291, simple_loss=0.3383, pruned_loss=0.1219, over 4704336.98 frames. ], batch size: 164, lr: 3.00e-02, grad_scale: 16.0 2023-09-28 17:03:54,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:04:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:02,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:04:02,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:02,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:04:05,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 17:04:05,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:04:07,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:07,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:10,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:04:10,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:10,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:13,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.09 vs. limit=15.0 2023-09-28 17:04:14,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:17,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:17,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:04:17,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:04:17,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:19,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=83893.33333333333, ans=0.125 2023-09-28 17:04:21,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:24,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:04:24,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:24,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:04:24,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 17:04:26,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:04:27,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:04:27,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:31,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:35,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:04:40,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:04:44,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:04:44,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:04:44,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 17:04:45,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:04:49,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:50,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:04:52,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:04:52,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=84026.66666666667, ans=0.125 2023-09-28 17:04:58,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=84093.33333333333, ans=0.125 2023-09-28 17:04:59,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:00,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:03,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:06,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:08,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:05:09,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:09,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 17:05:09,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:05:11,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:05:11,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=84093.33333333333, ans=0.0 2023-09-28 17:05:12,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 17:05:14,265 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.050e+02 2.608e+02 2.981e+02 3.638e+02 7.272e+02, threshold=5.963e+02, percent-clipped=1.0 2023-09-28 17:05:14,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:16,364 INFO [train.py:1039] (1/4) Epoch 3, batch 2000, loss[loss=0.2578, simple_loss=0.3154, pruned_loss=0.1001, over 24582.00 frames. ], tot_loss[loss=0.291, simple_loss=0.3383, pruned_loss=0.1219, over 4710812.72 frames. ], batch size: 60, lr: 3.00e-02, grad_scale: 32.0 2023-09-28 17:05:16,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=84160.0, ans=0.1 2023-09-28 17:05:18,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:05:19,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:05:19,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:05:21,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:05:23,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:26,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 17:05:26,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:05:29,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:05:31,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 17:05:31,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:05:31,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:32,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=84226.66666666667, ans=0.125 2023-09-28 17:05:35,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:05:36,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 17:05:38,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:41,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 17:05:41,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:05:44,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 17:05:44,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:48,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:05:49,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:05:49,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:51,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:05:51,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:05:53,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 17:05:58,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 17:05:58,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:58,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:00,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=84293.33333333333, ans=0.5 2023-09-28 17:06:04,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:05,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:06:05,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:06,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:06:06,728 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:06:08,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:09,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:09,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:09,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:11,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:15,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:06:15,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 17:06:15,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=84360.0, ans=0.2 2023-09-28 17:06:19,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:06:21,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:25,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:27,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:06:30,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:30,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=84426.66666666667, ans=10.0 2023-09-28 17:06:33,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:33,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:35,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:06:35,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:06:38,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:39,524 INFO [train.py:1039] (1/4) Epoch 3, batch 2050, loss[loss=0.3048, simple_loss=0.3571, pruned_loss=0.1263, over 24411.00 frames. ], tot_loss[loss=0.2911, simple_loss=0.3376, pruned_loss=0.1223, over 4709754.59 frames. ], batch size: 69, lr: 2.99e-02, grad_scale: 32.0 2023-09-28 17:06:39,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:43,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:43,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:50,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:53,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:06:53,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:54,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:06:56,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 17:06:56,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:06:58,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:58,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:07:10,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:10,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:13,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 17:07:15,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:15,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 17:07:17,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:18,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:21,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:22,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:07:22,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:23,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:07:25,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:07:25,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:07:28,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:30,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:07:32,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:07:35,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:07:40,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:07:44,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.00 vs. limit=15.0 2023-09-28 17:07:45,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:07:45,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=84760.0, ans=0.125 2023-09-28 17:07:46,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 17:07:51,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:07:53,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:07:54,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=84760.0, ans=0.2 2023-09-28 17:07:56,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:07:56,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=84760.0, ans=0.2 2023-09-28 17:07:58,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 17:08:01,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 17:08:01,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:02,769 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.062e+02 2.817e+02 3.171e+02 3.803e+02 7.947e+02, threshold=6.342e+02, percent-clipped=1.0 2023-09-28 17:08:02,811 INFO [train.py:1039] (1/4) Epoch 3, batch 2100, loss[loss=0.2984, simple_loss=0.3302, pruned_loss=0.1333, over 23436.00 frames. ], tot_loss[loss=0.2902, simple_loss=0.3358, pruned_loss=0.1223, over 4684591.88 frames. ], batch size: 119, lr: 2.99e-02, grad_scale: 16.0 2023-09-28 17:08:02,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:03,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:04,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:08:04,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 17:08:04,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 17:08:06,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=84826.66666666667, ans=0.125 2023-09-28 17:08:06,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=84826.66666666667, ans=0.125 2023-09-28 17:08:07,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:08:09,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:08:11,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:08:15,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:16,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:08:16,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 17:08:17,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:08:17,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 17:08:17,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 17:08:19,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:19,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:08:19,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 17:08:20,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:08:25,906 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.79 vs. limit=22.5 2023-09-28 17:08:28,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 17:08:28,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:31,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:08:31,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:34,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:08:36,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 17:08:36,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:36,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 17:08:39,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 17:08:39,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:39,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 17:08:40,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 17:08:41,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 17:08:42,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:08:44,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:08:48,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:48,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=84960.0, ans=0.0 2023-09-28 17:08:50,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:53,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:54,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:54,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 17:08:54,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:54,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:55,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:55,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 17:08:57,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 17:08:58,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 17:09:03,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:09:08,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:09:09,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 17:09:15,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:16,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:09:18,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:09:18,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:09:18,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 17:09:18,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:09:19,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:19,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:09:21,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:09:22,190 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.82 vs. limit=15.0 2023-09-28 17:09:23,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:23,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 17:09:25,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 17:09:26,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:27,904 INFO [train.py:1039] (1/4) Epoch 3, batch 2150, loss[loss=0.2856, simple_loss=0.3283, pruned_loss=0.1215, over 23775.00 frames. ], tot_loss[loss=0.2873, simple_loss=0.3332, pruned_loss=0.1207, over 4688789.71 frames. ], batch size: 164, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:09:31,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:09:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:09:31,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:09:32,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:09:37,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:09:40,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:40,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:43,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:09:43,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:43,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:09:47,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:47,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:09:47,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:09:52,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:53,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 17:09:57,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:09:59,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:10:01,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:01,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:10:04,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:04,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:10:05,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:10:07,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 17:10:08,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:10:10,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:10,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:11,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:10:12,756 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=15.0 2023-09-28 17:10:13,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:10:16,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:16,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:10:18,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:18,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 17:10:18,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:10:21,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:21,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:22,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:23,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:10:24,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:24,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:25,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=85360.0, ans=10.0 2023-09-28 17:10:26,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 17:10:26,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=85360.0, ans=0.125 2023-09-28 17:10:28,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 17:10:28,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:10:28,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=85360.0, ans=0.0 2023-09-28 17:10:29,671 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 17:10:29,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:29,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:10:29,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=85360.0, ans=0.1 2023-09-28 17:10:31,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 17:10:31,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:10:31,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 17:10:31,324 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 17:10:31,325 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 17:10:33,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 17:10:35,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:35,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:35,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:10:37,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:38,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:10:39,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=85426.66666666667, ans=0.125 2023-09-28 17:10:40,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:40,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:44,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=85426.66666666667, ans=0.1 2023-09-28 17:10:47,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=85426.66666666667, ans=0.0 2023-09-28 17:10:48,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:10:50,163 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.981e+02 2.450e+02 2.912e+02 3.382e+02 5.716e+02, threshold=5.824e+02, percent-clipped=0.0 2023-09-28 17:10:50,207 INFO [train.py:1039] (1/4) Epoch 3, batch 2200, loss[loss=0.2874, simple_loss=0.3406, pruned_loss=0.1171, over 24633.00 frames. ], tot_loss[loss=0.2866, simple_loss=0.3326, pruned_loss=0.1203, over 4682298.14 frames. ], batch size: 65, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:10:50,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 17:10:53,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:10:56,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:58,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:10:59,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:00,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=85493.33333333333, ans=0.125 2023-09-28 17:11:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:11:05,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:11:06,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:11:06,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 17:11:12,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 17:11:14,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:11:20,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 17:11:21,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:23,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:23,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:11:28,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:11:28,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 17:11:28,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=85626.66666666667, ans=0.125 2023-09-28 17:11:31,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:11:32,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:34,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:11:38,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:11:39,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:41,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:11:42,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:45,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 17:11:46,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:48,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 17:11:50,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:50,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:11:50,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=85693.33333333333, ans=0.0 2023-09-28 17:11:52,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:53,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:55,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:55,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:58,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:11:58,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:11:59,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:12:02,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:12:02,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:06,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:12:07,692 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 17:12:09,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:12:11,179 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 17:12:13,189 INFO [train.py:1039] (1/4) Epoch 3, batch 2250, loss[loss=0.2925, simple_loss=0.3537, pruned_loss=0.1157, over 24319.00 frames. ], tot_loss[loss=0.2878, simple_loss=0.334, pruned_loss=0.1208, over 4685981.78 frames. ], batch size: 74, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:12:13,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:12:13,364 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 17:12:14,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:15,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:12:15,974 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.74 vs. limit=15.0 2023-09-28 17:12:16,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:18,601 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 17:12:21,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:12:23,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:28,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:12:29,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:12:33,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=85893.33333333333, ans=0.0 2023-09-28 17:12:34,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:34,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:36,599 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.30 vs. limit=6.0 2023-09-28 17:12:37,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 17:12:37,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:12:37,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:12:40,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 17:12:40,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:12:40,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:43,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:46,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=85960.0, ans=0.125 2023-09-28 17:12:47,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:49,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:12:49,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:12:50,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 17:12:52,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:55,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:13:01,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:04,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:13:07,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:13:09,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:13:12,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:13:15,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:13:22,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:13:22,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:13:22,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:13:27,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=86093.33333333333, ans=0.0 2023-09-28 17:13:29,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:13:31,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.93 vs. limit=15.0 2023-09-28 17:13:32,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:13:32,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 17:13:32,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:32,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:13:35,616 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.481e+02 2.992e+02 3.507e+02 5.214e+02, threshold=5.985e+02, percent-clipped=0.0 2023-09-28 17:13:35,659 INFO [train.py:1039] (1/4) Epoch 3, batch 2300, loss[loss=0.282, simple_loss=0.3355, pruned_loss=0.1142, over 24658.00 frames. ], tot_loss[loss=0.2881, simple_loss=0.3353, pruned_loss=0.1204, over 4696709.23 frames. ], batch size: 65, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:13:35,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 17:13:36,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=86160.0, ans=0.07 2023-09-28 17:13:38,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:13:38,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:40,171 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.99 vs. limit=15.0 2023-09-28 17:13:43,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:43,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:13:45,563 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 17:13:47,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:55,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:13:55,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:13:55,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:13:57,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:57,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 17:13:59,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:14:02,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:02,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:14:07,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:14:10,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:14:10,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=86293.33333333333, ans=0.0 2023-09-28 17:14:13,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:21,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:14:21,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:14:24,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:14:26,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:14:31,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:33,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:14:33,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:14:33,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 17:14:35,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=86360.0, ans=0.125 2023-09-28 17:14:38,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:14:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:38,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:38,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:14:38,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:40,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:14:40,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:14:40,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 17:14:40,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:14:40,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:43,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 17:14:44,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=86426.66666666667, ans=0.125 2023-09-28 17:14:46,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=86426.66666666667, ans=0.09899494936611666 2023-09-28 17:14:49,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:14:52,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:14:57,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:57,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:14:57,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:14:58,584 INFO [train.py:1039] (1/4) Epoch 3, batch 2350, loss[loss=0.387, simple_loss=0.3923, pruned_loss=0.1908, over 19578.00 frames. ], tot_loss[loss=0.2883, simple_loss=0.3361, pruned_loss=0.1203, over 4699937.63 frames. ], batch size: 388, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:14:58,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:14:58,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:00,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:15:00,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 17:15:00,631 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:15:07,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:07,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 17:15:12,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 17:15:16,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:15:19,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:19,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:21,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 17:15:24,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:15:30,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 17:15:30,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=86626.66666666667, ans=0.1 2023-09-28 17:15:31,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=86626.66666666667, ans=0.125 2023-09-28 17:15:33,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:34,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:15:34,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:38,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:15:39,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 17:15:41,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:15:42,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=86626.66666666667, ans=0.09899494936611666 2023-09-28 17:15:45,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:45,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:15:45,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:15:48,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:15:50,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 17:15:52,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:53,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:55,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:15:56,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 17:15:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:16:00,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=86693.33333333333, ans=0.125 2023-09-28 17:16:01,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 17:16:01,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:16:06,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 17:16:07,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 17:16:09,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:16:09,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:16:10,635 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 17:16:10,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 17:16:14,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 17:16:16,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:16:20,374 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.689e+02 3.044e+02 3.623e+02 6.836e+02, threshold=6.088e+02, percent-clipped=1.0 2023-09-28 17:16:20,416 INFO [train.py:1039] (1/4) Epoch 3, batch 2400, loss[loss=0.2654, simple_loss=0.3218, pruned_loss=0.1045, over 24475.00 frames. ], tot_loss[loss=0.2891, simple_loss=0.3359, pruned_loss=0.1211, over 4695605.00 frames. ], batch size: 63, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:16:21,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:16:22,492 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.66 vs. limit=12.0 2023-09-28 17:16:27,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:16:28,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:16:28,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 17:16:30,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 17:16:36,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:16:36,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:16:38,543 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:16:39,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 17:16:39,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:16:39,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:40,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 17:16:42,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=86893.33333333333, ans=0.2 2023-09-28 17:16:44,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:49,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 17:16:56,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:17:00,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 17:17:05,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:05,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:09,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:09,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 17:17:10,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:17:16,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:19,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:21,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:22,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=87026.66666666667, ans=0.125 2023-09-28 17:17:23,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:17:23,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:17:23,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:17:23,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:24,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:24,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:17:29,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:17:31,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:17:31,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 17:17:34,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 17:17:35,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:37,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:37,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 17:17:38,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 17:17:38,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 17:17:38,863 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 17:17:39,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 17:17:40,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:40,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:40,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:42,285 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 17:17:43,677 INFO [train.py:1039] (1/4) Epoch 3, batch 2450, loss[loss=0.2764, simple_loss=0.3229, pruned_loss=0.1149, over 23362.00 frames. ], tot_loss[loss=0.2886, simple_loss=0.3352, pruned_loss=0.121, over 4691963.29 frames. ], batch size: 119, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:17:43,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:43,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:17:48,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:17:48,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:53,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:53,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:53,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 17:17:57,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:57,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:03,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:18:03,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:18:03,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:18:03,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 17:18:03,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=87226.66666666667, ans=0.09899494936611666 2023-09-28 17:18:04,242 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.66 vs. limit=22.5 2023-09-28 17:18:09,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:11,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:18:12,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:18:13,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=87226.66666666667, ans=0.125 2023-09-28 17:18:14,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=87226.66666666667, ans=0.0 2023-09-28 17:18:15,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:18:15,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:15,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:17,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:18:18,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 17:18:20,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:18:29,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:29,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:31,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:31,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:18:33,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:35,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:18:37,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 17:18:40,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:40,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:18:41,501 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.29 vs. limit=15.0 2023-09-28 17:18:42,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:18:43,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:49,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:18:49,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 17:18:51,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:18:52,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:18:52,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 17:18:54,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:18:54,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:18:57,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:19:00,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:00,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:19:04,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 17:19:05,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:19:07,575 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.571e+02 3.066e+02 3.811e+02 5.963e+02, threshold=6.132e+02, percent-clipped=0.0 2023-09-28 17:19:07,617 INFO [train.py:1039] (1/4) Epoch 3, batch 2500, loss[loss=0.3153, simple_loss=0.3623, pruned_loss=0.1341, over 23949.00 frames. ], tot_loss[loss=0.2868, simple_loss=0.3338, pruned_loss=0.1199, over 4701866.80 frames. ], batch size: 86, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:19:13,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:22,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:19:22,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:19:23,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.17 vs. limit=15.0 2023-09-28 17:19:23,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:23,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 17:19:24,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.09 vs. limit=22.5 2023-09-28 17:19:31,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:19:33,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:19:33,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:19:34,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:19:37,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 17:19:37,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:38,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:38,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 17:19:38,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:38,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 17:19:40,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:44,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:44,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:48,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:19:49,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 17:19:51,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:19:51,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:56,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:59,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:02,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:03,474 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=12.0 2023-09-28 17:20:08,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:20:13,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 17:20:13,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:13,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:14,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:20:14,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:20:15,027 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 17:20:15,028 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 17:20:15,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 17:20:19,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:21,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 17:20:21,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 17:20:23,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 17:20:27,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 17:20:30,814 INFO [train.py:1039] (1/4) Epoch 3, batch 2550, loss[loss=0.2392, simple_loss=0.294, pruned_loss=0.09221, over 24446.00 frames. ], tot_loss[loss=0.2865, simple_loss=0.3341, pruned_loss=0.1195, over 4705835.63 frames. ], batch size: 58, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:20:30,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:32,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=87826.66666666667, ans=0.125 2023-09-28 17:20:33,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:20:35,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:20:35,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:37,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 17:20:37,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:20:41,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 17:20:43,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:20:45,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:47,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:47,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 17:20:49,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:20:49,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:20:49,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:53,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:20:53,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 17:20:53,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:53,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:53,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 17:20:55,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.40 vs. limit=15.0 2023-09-28 17:21:03,524 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.30 vs. limit=15.0 2023-09-28 17:21:06,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=87960.0, ans=0.125 2023-09-28 17:21:08,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:21:13,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:13,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:13,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:21:15,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:21:22,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:21:25,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:21:25,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:21:25,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:21:25,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:21:25,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:21:29,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:29,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:34,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:21:36,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 17:21:36,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:21:37,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:37,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:21:39,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:21:40,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:49,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:21:51,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:53,288 INFO [train.py:1039] (1/4) Epoch 3, batch 2600, loss[loss=0.3023, simple_loss=0.3394, pruned_loss=0.1326, over 23813.00 frames. ], tot_loss[loss=0.2877, simple_loss=0.3349, pruned_loss=0.1202, over 4708973.92 frames. ], batch size: 179, lr: 2.95e-02, grad_scale: 16.0 2023-09-28 17:21:54,710 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.952e+02 2.618e+02 3.140e+02 3.668e+02 6.690e+02, threshold=6.281e+02, percent-clipped=1.0 2023-09-28 17:21:54,948 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 17:21:58,535 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 17:21:58,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:22:00,072 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 17:22:00,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 17:22:00,241 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 17:22:03,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:22:03,354 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 17:22:05,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 17:22:07,067 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 17:22:07,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=88160.0, ans=0.125 2023-09-28 17:22:07,731 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.05 vs. limit=22.5 2023-09-28 17:22:09,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:22:10,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 17:22:12,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 17:22:13,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:22:13,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 17:22:16,841 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 17:22:16,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 17:22:18,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=88226.66666666667, ans=0.0 2023-09-28 17:22:24,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:24,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:24,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 17:22:27,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:22:35,250 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 17:22:38,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=88293.33333333333, ans=0.0 2023-09-28 17:22:39,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:42,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:42,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 17:22:42,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:22:42,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:44,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 17:22:46,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:22:46,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:22:47,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:52,209 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 17:22:53,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:53,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:23:00,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:23:00,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:23:00,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 17:23:01,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:23:03,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:04,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:05,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=88426.66666666667, ans=0.125 2023-09-28 17:23:11,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 17:23:11,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:14,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:23:16,461 INFO [train.py:1039] (1/4) Epoch 3, batch 2650, loss[loss=0.2717, simple_loss=0.3133, pruned_loss=0.1151, over 23374.00 frames. ], tot_loss[loss=0.2906, simple_loss=0.3373, pruned_loss=0.1219, over 4688630.98 frames. ], batch size: 105, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:23:16,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=88493.33333333333, ans=0.125 2023-09-28 17:23:20,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 17:23:21,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:21,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:23:23,352 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 17:23:23,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:23:24,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:26,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=88493.33333333333, ans=0.1 2023-09-28 17:23:28,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:23:28,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=88493.33333333333, ans=0.2 2023-09-28 17:23:29,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:32,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:23:34,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 17:23:34,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:23:34,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:23:37,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 17:23:39,588 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 17:23:43,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:46,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 17:23:46,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:23:46,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 17:23:50,028 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.80 vs. limit=6.0 2023-09-28 17:23:50,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:50,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:23:51,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:51,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:23:55,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=88626.66666666667, ans=0.125 2023-09-28 17:23:56,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 17:23:58,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 17:23:59,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:02,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 17:24:02,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:02,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:03,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:03,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=88626.66666666667, ans=10.0 2023-09-28 17:24:04,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:04,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:06,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:08,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:09,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:24:09,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:24:11,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:24:13,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:14,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:24:14,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:16,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:16,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:24:21,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:22,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:24:24,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:24,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 17:24:27,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:29,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:32,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:34,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:35,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:35,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:37,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:24:37,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 17:24:38,872 INFO [train.py:1039] (1/4) Epoch 3, batch 2700, loss[loss=0.2777, simple_loss=0.3322, pruned_loss=0.1116, over 23967.00 frames. ], tot_loss[loss=0.2911, simple_loss=0.3379, pruned_loss=0.1221, over 4679982.34 frames. ], batch size: 80, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:24:40,990 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.674e+02 3.068e+02 3.788e+02 5.664e+02, threshold=6.136e+02, percent-clipped=0.0 2023-09-28 17:24:41,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:24:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:24:44,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:46,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:46,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:49,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:24:49,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:49,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:24:49,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:24:50,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 17:24:52,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:24:52,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:54,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:24:54,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:58,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:25:00,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 17:25:00,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:03,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.90 vs. limit=15.0 2023-09-28 17:25:05,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:25:05,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:12,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:25:12,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:25:14,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:25:14,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:25:18,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:21,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:22,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:25:22,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:25:27,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:27,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:25:33,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.54 vs. limit=22.5 2023-09-28 17:25:34,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:25:36,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:25:36,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=89026.66666666667, ans=0.2 2023-09-28 17:25:39,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:25:39,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:25:44,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:44,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:45,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=89093.33333333333, ans=0.0 2023-09-28 17:25:45,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.49 vs. limit=22.5 2023-09-28 17:25:46,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:48,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:25:49,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:49,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:25:53,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:54,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:54,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:57,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 17:25:59,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:02,680 INFO [train.py:1039] (1/4) Epoch 3, batch 2750, loss[loss=0.2862, simple_loss=0.326, pruned_loss=0.1232, over 23811.00 frames. ], tot_loss[loss=0.2905, simple_loss=0.3376, pruned_loss=0.1217, over 4685089.14 frames. ], batch size: 164, lr: 2.93e-02, grad_scale: 16.0 2023-09-28 17:26:02,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:26:02,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 17:26:04,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 17:26:04,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:07,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:07,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:10,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:10,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:26:10,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:15,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:17,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:26:17,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:26:17,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:17,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 17:26:17,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:26:17,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:21,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=89226.66666666667, ans=0.125 2023-09-28 17:26:24,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 17:26:24,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=89226.66666666667, ans=0.05 2023-09-28 17:26:27,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:26:27,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:29,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:26:29,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:26:30,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:32,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:26:32,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:33,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:37,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:26:37,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:26:39,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:26:39,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:42,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:26:47,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:49,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:26:49,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:54,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:54,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:26:54,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:27:01,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:27:03,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:27:03,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 17:27:03,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=89360.0, ans=0.0 2023-09-28 17:27:07,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:09,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 17:27:14,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:27:17,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:27:17,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 17:27:19,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:27:23,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:27:23,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 17:27:23,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:27:26,184 INFO [train.py:1039] (1/4) Epoch 3, batch 2800, loss[loss=0.2777, simple_loss=0.3167, pruned_loss=0.1193, over 23673.00 frames. ], tot_loss[loss=0.2875, simple_loss=0.3353, pruned_loss=0.1199, over 4704505.36 frames. ], batch size: 164, lr: 2.93e-02, grad_scale: 32.0 2023-09-28 17:27:27,584 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.563e+02 3.005e+02 3.573e+02 5.260e+02, threshold=6.010e+02, percent-clipped=0.0 2023-09-28 17:27:27,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:27:27,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:27,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:27:29,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 17:27:29,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:29,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:31,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:32,643 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 17:27:32,644 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 17:27:36,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:37,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:27:37,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:27:42,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:27:44,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 17:27:44,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=89560.0, ans=0.125 2023-09-28 17:27:47,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:27:49,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 17:27:50,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:50,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:27:50,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:27:54,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:27:54,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:54,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:27:56,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:04,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:28:07,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:10,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:10,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:28:11,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:17,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:17,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 17:28:17,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:21,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:21,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:28:21,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=89693.33333333333, ans=0.05 2023-09-28 17:28:24,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:25,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:26,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=89693.33333333333, ans=0.0 2023-09-28 17:28:28,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-09-28 17:28:30,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:32,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:28:32,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:32,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:28:32,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:28:32,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:28:34,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:28:34,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 17:28:34,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:36,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:36,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:37,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 17:28:39,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:39,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:28:41,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:28:43,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 17:28:49,357 INFO [train.py:1039] (1/4) Epoch 3, batch 2850, loss[loss=0.2866, simple_loss=0.3176, pruned_loss=0.1278, over 22944.00 frames. ], tot_loss[loss=0.2856, simple_loss=0.3332, pruned_loss=0.1191, over 4700693.13 frames. ], batch size: 322, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:28:49,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:49,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:28:51,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:28:52,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:28:56,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:28:56,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:56,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:29:01,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:01,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:29:02,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:29:02,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 17:29:10,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 17:29:10,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:12,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 17:29:13,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:18,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 17:29:18,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 17:29:19,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:31,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:31,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=89960.0, ans=0.0 2023-09-28 17:29:32,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:33,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:29:34,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:29:34,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:29:34,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:29:37,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:29:37,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 17:29:41,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:29:41,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:29:41,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:41,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:44,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:48,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:50,123 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:29:51,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:29:52,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:52,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:53,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:29:58,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:30:00,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 17:30:00,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 17:30:03,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:30:05,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:05,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 17:30:05,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:30:06,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:06,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:30:06,939 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 17:30:08,398 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 17:30:08,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:08,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:13,065 INFO [train.py:1039] (1/4) Epoch 3, batch 2900, loss[loss=0.2996, simple_loss=0.3594, pruned_loss=0.1199, over 24671.00 frames. ], tot_loss[loss=0.2848, simple_loss=0.3324, pruned_loss=0.1186, over 4693626.99 frames. ], batch size: 73, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:30:13,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:15,028 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.599e+02 2.941e+02 3.399e+02 5.344e+02, threshold=5.883e+02, percent-clipped=0.0 2023-09-28 17:30:15,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:15,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:30:17,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 17:30:22,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:22,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 17:30:22,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 17:30:24,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:30:24,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:30:26,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:27,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:30:32,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:32,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:37,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:30:37,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 17:30:38,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:30:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:43,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 17:30:43,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 17:30:48,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:48,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 17:30:48,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:30:49,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:30:51,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:51,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:53,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:56,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:57,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=90293.33333333333, ans=0.0 2023-09-28 17:30:57,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=90293.33333333333, ans=0.1 2023-09-28 17:30:59,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:30:59,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=90293.33333333333, ans=0.0 2023-09-28 17:30:59,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.05 vs. limit=6.0 2023-09-28 17:31:00,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 17:31:00,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 17:31:00,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:31:04,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:31:06,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 17:31:06,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:31:11,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:31:17,439 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.42 vs. limit=15.0 2023-09-28 17:31:21,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:31:21,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:31:23,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 17:31:25,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=90426.66666666667, ans=0.125 2023-09-28 17:31:27,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:27,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 17:31:27,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=90426.66666666667, ans=0.125 2023-09-28 17:31:28,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:29,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:31:35,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:36,414 INFO [train.py:1039] (1/4) Epoch 3, batch 2950, loss[loss=0.2898, simple_loss=0.3345, pruned_loss=0.1226, over 23362.00 frames. ], tot_loss[loss=0.2867, simple_loss=0.3339, pruned_loss=0.1197, over 4690404.46 frames. ], batch size: 105, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:31:36,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 17:31:38,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:38,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:39,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:31:41,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:31:43,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 17:31:44,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 17:31:46,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:31:46,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:48,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=90493.33333333333, ans=0.125 2023-09-28 17:31:52,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:31:55,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:31:57,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:31:57,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:32:02,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:02,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:32:03,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=90560.0, ans=0.0 2023-09-28 17:32:04,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:32:07,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 17:32:12,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 17:32:12,819 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 17:32:12,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:32:14,467 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 17:32:14,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=90626.66666666667, ans=0.2 2023-09-28 17:32:15,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 17:32:16,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:32:16,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=90626.66666666667, ans=0.0 2023-09-28 17:32:17,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:32:17,993 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 17:32:18,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:32:22,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 17:32:22,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:32:22,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:32:25,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:27,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=90693.33333333333, ans=0.0 2023-09-28 17:32:28,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:32:28,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:28,805 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 17:32:28,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:30,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 17:32:36,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:37,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=90693.33333333333, ans=0.0 2023-09-28 17:32:38,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:32:38,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 17:32:38,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:32:40,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 17:32:43,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:43,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:45,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:32:46,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:46,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:32:47,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:32:48,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:48,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:32:48,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=90760.0, ans=0.125 2023-09-28 17:32:48,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=90760.0, ans=0.125 2023-09-28 17:32:49,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:32:50,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:51,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:32:54,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:54,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 17:32:56,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:59,383 INFO [train.py:1039] (1/4) Epoch 3, batch 3000, loss[loss=0.2465, simple_loss=0.305, pruned_loss=0.09403, over 24504.00 frames. ], tot_loss[loss=0.2871, simple_loss=0.3348, pruned_loss=0.1197, over 4703934.04 frames. ], batch size: 58, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:32:59,383 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 17:33:07,886 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.7598, 3.8137, 4.4019, 4.1778], device='cuda:1') 2023-09-28 17:33:13,937 INFO [train.py:1071] (1/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3326, pruned_loss=0.2311, over 1125622.00 frames. 2023-09-28 17:33:13,938 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 17:33:15,398 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.502e+02 2.937e+02 3.419e+02 4.607e+02, threshold=5.874e+02, percent-clipped=0.0 2023-09-28 17:33:15,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:33:16,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:33:18,749 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 17:33:20,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 17:33:23,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:33:23,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:33:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 17:33:24,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:32,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:33:33,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=90893.33333333333, ans=0.2 2023-09-28 17:33:42,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:33:48,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 17:33:49,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=90960.0, ans=0.2 2023-09-28 17:33:50,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:33:54,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:33:54,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:54,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:33:57,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:33:57,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 17:33:57,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=90960.0, ans=0.125 2023-09-28 17:34:00,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 17:34:03,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:34:03,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:34:05,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:34:05,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:07,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:07,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:34:10,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:34:10,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:34:10,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:34:11,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:13,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 17:34:14,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:34:14,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:16,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:34:21,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:21,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:22,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:34:23,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 17:34:25,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:34:25,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 17:34:25,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:34:30,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 17:34:31,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:34:33,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:34:33,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 17:34:34,803 INFO [train.py:1039] (1/4) Epoch 3, batch 3050, loss[loss=0.2688, simple_loss=0.3376, pruned_loss=0.1, over 24270.00 frames. ], tot_loss[loss=0.2861, simple_loss=0.3345, pruned_loss=0.1189, over 4698807.59 frames. ], batch size: 74, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:34:34,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 17:34:34,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:34:36,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:34:38,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:38,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:34:38,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:40,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:34:40,296 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:34:41,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 17:34:43,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:34:45,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=91160.0, ans=0.0 2023-09-28 17:34:46,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:34:47,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:34:49,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=91226.66666666667, ans=0.0 2023-09-28 17:34:50,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:53,439 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.21 vs. limit=6.0 2023-09-28 17:34:54,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 17:35:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 17:35:02,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 17:35:02,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:04,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=91226.66666666667, ans=0.1 2023-09-28 17:35:07,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:35:09,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:09,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:10,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=91293.33333333333, ans=0.125 2023-09-28 17:35:11,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:14,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:16,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:35:16,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:16,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:16,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:17,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:19,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff3.min_abs, batch_count=91293.33333333333, ans=0.2 2023-09-28 17:35:19,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.03 vs. limit=15.0 2023-09-28 17:35:20,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:22,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:22,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 17:35:23,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:23,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:35:27,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:35:27,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:35:27,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:35:29,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:33,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:33,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:39,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:40,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:35:40,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:42,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:42,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:35:42,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:44,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 17:35:44,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=91426.66666666667, ans=0.125 2023-09-28 17:35:46,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:46,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:46,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=91426.66666666667, ans=0.0 2023-09-28 17:35:47,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 17:35:50,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:55,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:57,520 INFO [train.py:1039] (1/4) Epoch 3, batch 3100, loss[loss=0.2951, simple_loss=0.354, pruned_loss=0.1181, over 24013.00 frames. ], tot_loss[loss=0.2848, simple_loss=0.3333, pruned_loss=0.1181, over 4713661.80 frames. ], batch size: 86, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:35:57,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:35:59,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:36:00,683 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.573e+02 3.095e+02 3.783e+02 7.787e+02, threshold=6.189e+02, percent-clipped=2.0 2023-09-28 17:36:00,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 17:36:03,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 17:36:05,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 17:36:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:36:10,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:36:10,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=91493.33333333333, ans=0.0 2023-09-28 17:36:12,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:13,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:36:19,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:20,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=91560.0, ans=0.125 2023-09-28 17:36:22,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=91560.0, ans=0.05 2023-09-28 17:36:25,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 17:36:29,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:36:31,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:32,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:36:33,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:36:33,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:36:35,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:36:35,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 17:36:35,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:36:36,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:39,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 17:36:39,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:36:43,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:36:43,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 17:36:45,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 17:36:47,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:47,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:50,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:36:50,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:50,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:36:54,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:36:54,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:36:54,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:36:55,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:36:55,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:55,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 17:37:00,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:37:00,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=91693.33333333333, ans=0.0 2023-09-28 17:37:01,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 17:37:05,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:37:05,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 17:37:06,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:07,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:08,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 17:37:19,683 INFO [train.py:1039] (1/4) Epoch 3, batch 3150, loss[loss=0.2566, simple_loss=0.3096, pruned_loss=0.1018, over 24335.00 frames. ], tot_loss[loss=0.2837, simple_loss=0.333, pruned_loss=0.1172, over 4709436.25 frames. ], batch size: 56, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:37:19,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 17:37:22,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:22,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=91826.66666666667, ans=0.125 2023-09-28 17:37:23,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:25,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:37:25,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:37:25,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 17:37:27,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:27,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:37:28,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 17:37:30,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:30,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=91826.66666666667, ans=0.0 2023-09-28 17:37:32,332 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 17:37:36,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 17:37:36,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:37:39,102 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 17:37:39,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:37:40,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 17:37:40,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 17:37:40,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 17:37:40,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:40,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:37:42,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:45,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 17:37:47,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:47,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:48,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:37:54,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 17:37:54,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:37:57,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:37:57,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:59,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 17:38:02,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 17:38:04,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:38:04,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:38:04,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:38:06,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:06,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:38:06,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:38:07,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:38:09,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 17:38:09,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:38:09,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:11,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:38:11,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:38:13,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 17:38:13,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:13,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=92026.66666666667, ans=0.0 2023-09-28 17:38:14,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 17:38:16,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:17,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 17:38:19,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 17:38:20,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:38:20,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:21,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 17:38:22,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:38:22,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:25,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:38:26,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=92093.33333333333, ans=0.125 2023-09-28 17:38:27,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:27,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:38:34,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:38:34,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:37,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 17:38:43,052 INFO [train.py:1039] (1/4) Epoch 3, batch 3200, loss[loss=0.2775, simple_loss=0.2872, pruned_loss=0.1339, over 19153.00 frames. ], tot_loss[loss=0.2834, simple_loss=0.3324, pruned_loss=0.1171, over 4707053.79 frames. ], batch size: 388, lr: 2.90e-02, grad_scale: 32.0 2023-09-28 17:38:43,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:38:43,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:38:46,886 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.531e+02 2.998e+02 3.452e+02 5.958e+02, threshold=5.995e+02, percent-clipped=0.0 2023-09-28 17:38:47,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:48,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:38:48,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 17:38:51,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.16 vs. limit=15.0 2023-09-28 17:38:51,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:54,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:38:59,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:39:08,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:39:19,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 17:39:21,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:39:24,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 17:39:24,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:39:27,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:39:27,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:39:29,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:39:29,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=92293.33333333333, ans=0.125 2023-09-28 17:39:32,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 17:39:33,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=92360.0, ans=0.125 2023-09-28 17:39:34,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:39:38,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 17:39:43,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 17:39:45,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:39:51,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:39:51,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,381 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 17:39:51,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:39:55,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:39:56,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 17:39:56,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 17:39:58,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 17:39:59,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 17:40:01,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:40:05,799 INFO [train.py:1039] (1/4) Epoch 3, batch 3250, loss[loss=0.2542, simple_loss=0.3246, pruned_loss=0.09194, over 24661.00 frames. ], tot_loss[loss=0.2828, simple_loss=0.3322, pruned_loss=0.1167, over 4708114.60 frames. ], batch size: 73, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:40:05,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:40:05,902 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 17:40:05,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:05,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:07,508 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 17:40:10,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:40:15,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:16,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=92493.33333333333, ans=15.0 2023-09-28 17:40:17,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=92493.33333333333, ans=0.125 2023-09-28 17:40:20,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=92493.33333333333, ans=0.0 2023-09-28 17:40:22,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:40:22,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 17:40:23,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:23,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:40:25,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:27,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:27,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:40:30,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:40:30,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:30,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:40:32,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:33,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:35,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:35,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:37,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:37,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:37,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:40:42,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 17:40:42,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:42,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:40:42,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=92626.66666666667, ans=0.125 2023-09-28 17:40:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:46,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:40:50,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=92626.66666666667, ans=0.125 2023-09-28 17:40:52,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:40:57,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.16 vs. limit=22.5 2023-09-28 17:41:00,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:00,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:00,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 17:41:00,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:41:01,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.86 vs. limit=15.0 2023-09-28 17:41:02,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:41:02,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:05,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 17:41:05,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 17:41:06,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:41:06,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:07,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=92693.33333333333, ans=0.1 2023-09-28 17:41:08,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:08,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:41:09,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:12,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:12,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:15,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 17:41:15,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:18,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:41:18,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 17:41:23,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:23,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 17:41:25,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 17:41:26,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 17:41:26,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:29,925 INFO [train.py:1039] (1/4) Epoch 3, batch 3300, loss[loss=0.2964, simple_loss=0.3389, pruned_loss=0.127, over 23205.00 frames. ], tot_loss[loss=0.284, simple_loss=0.3329, pruned_loss=0.1175, over 4704929.27 frames. ], batch size: 119, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:41:30,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:31,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:41:31,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:33,734 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.895e+02 2.576e+02 3.097e+02 3.556e+02 6.978e+02, threshold=6.193e+02, percent-clipped=2.0 2023-09-28 17:41:34,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:41:35,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:41:37,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:40,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:43,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 17:41:44,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:41:44,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:47,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:47,718 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 17:41:49,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:41:49,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:41:51,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:41:51,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:41:51,497 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 17:41:58,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:58,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:41:59,310 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.22 vs. limit=15.0 2023-09-28 17:42:01,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:01,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 17:42:03,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 17:42:03,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:04,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:42:06,455 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 17:42:09,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 17:42:09,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:13,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 17:42:13,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=92960.0, ans=0.125 2023-09-28 17:42:17,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:19,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:42:20,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:23,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:23,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:23,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:42:23,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:42:26,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:42:26,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:26,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:42:27,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=93026.66666666667, ans=0.2 2023-09-28 17:42:28,344 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 17:42:30,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 17:42:32,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:42:32,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:42:32,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:34,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:34,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:36,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:42:37,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:37,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:42:37,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:39,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:42:41,858 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.12 vs. limit=15.0 2023-09-28 17:42:42,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 17:42:44,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:44,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:44,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=93093.33333333333, ans=0.125 2023-09-28 17:42:47,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:42:47,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:50,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:52,063 INFO [train.py:1039] (1/4) Epoch 3, batch 3350, loss[loss=0.282, simple_loss=0.3276, pruned_loss=0.1182, over 23497.00 frames. ], tot_loss[loss=0.2847, simple_loss=0.3339, pruned_loss=0.1178, over 4716318.22 frames. ], batch size: 105, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:42:52,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:52,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:53,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:55,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:56,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:59,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:03,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:43:05,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:05,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:43:06,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 17:43:08,417 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 17:43:08,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:12,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=93226.66666666667, ans=0.1 2023-09-28 17:43:14,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 17:43:14,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 17:43:14,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:43:15,247 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:43:16,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:43:16,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:17,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 17:43:17,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:18,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:43:20,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:22,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:23,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:23,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=93293.33333333333, ans=0.125 2023-09-28 17:43:24,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:43:27,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:27,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=93293.33333333333, ans=0.0 2023-09-28 17:43:29,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:30,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:33,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:43:35,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:39,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:39,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:42,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:45,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 17:43:45,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:43:45,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 17:43:45,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:43:47,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 17:43:49,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:50,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:54,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=93360.0, ans=0.0 2023-09-28 17:43:55,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=93426.66666666667, ans=0.125 2023-09-28 17:43:55,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=93426.66666666667, ans=0.0 2023-09-28 17:43:57,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:57,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 17:43:57,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:43:59,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:44:00,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:44:05,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:08,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 17:44:10,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:44:10,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:44:10,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=93426.66666666667, ans=0.0 2023-09-28 17:44:11,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:12,978 INFO [train.py:1039] (1/4) Epoch 3, batch 3400, loss[loss=0.3329, simple_loss=0.361, pruned_loss=0.1524, over 23805.00 frames. ], tot_loss[loss=0.2839, simple_loss=0.3336, pruned_loss=0.1171, over 4724819.79 frames. ], batch size: 179, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:44:13,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 17:44:13,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:44:13,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 17:44:14,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=93493.33333333333, ans=0.125 2023-09-28 17:44:15,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:15,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:15,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:44:17,383 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.863e+02 2.557e+02 2.981e+02 3.725e+02 6.496e+02, threshold=5.961e+02, percent-clipped=1.0 2023-09-28 17:44:17,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:44:19,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 17:44:22,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 17:44:22,734 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 17:44:22,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:44:28,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:28,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:44:28,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:29,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:44:35,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:44:36,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=93560.0, ans=0.125 2023-09-28 17:44:37,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 17:44:39,314 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:44:43,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:44:45,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:46,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:47,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:44:55,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:44:59,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=93626.66666666667, ans=0.125 2023-09-28 17:45:00,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 17:45:04,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:05,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:05,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 17:45:06,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=93693.33333333333, ans=0.0 2023-09-28 17:45:07,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:07,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:07,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:45:08,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:45:11,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:45:15,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=93693.33333333333, ans=0.125 2023-09-28 17:45:16,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:45:16,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:45:18,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=93760.0, ans=0.125 2023-09-28 17:45:22,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:24,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 17:45:33,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:45:36,559 INFO [train.py:1039] (1/4) Epoch 3, batch 3450, loss[loss=0.2981, simple_loss=0.3517, pruned_loss=0.1222, over 23626.00 frames. ], tot_loss[loss=0.2835, simple_loss=0.333, pruned_loss=0.117, over 4721750.91 frames. ], batch size: 93, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:45:38,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 17:45:42,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 17:45:43,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:45,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:45:45,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 17:45:46,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:49,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:45:55,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:45:55,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:45:55,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:45:55,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:59,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:04,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 17:46:06,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.04 vs. limit=15.0 2023-09-28 17:46:12,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 17:46:12,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:46:12,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:46:13,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:15,607 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.49 vs. limit=15.0 2023-09-28 17:46:20,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 17:46:21,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:46:24,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=94026.66666666667, ans=0.125 2023-09-28 17:46:25,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:25,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:46:27,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:46:28,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:46:30,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 17:46:30,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:46:30,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:32,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=94026.66666666667, ans=0.1 2023-09-28 17:46:35,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:46:39,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 17:46:42,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:46:49,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:46:49,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:52,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:46:57,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:57,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:57,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:46:58,902 INFO [train.py:1039] (1/4) Epoch 3, batch 3500, loss[loss=0.2628, simple_loss=0.3288, pruned_loss=0.09839, over 24489.00 frames. ], tot_loss[loss=0.283, simple_loss=0.3315, pruned_loss=0.1172, over 4709719.43 frames. ], batch size: 66, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:46:59,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:47:03,553 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.532e+02 3.066e+02 3.931e+02 6.870e+02, threshold=6.132e+02, percent-clipped=2.0 2023-09-28 17:47:03,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:07,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:47:07,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 17:47:09,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:47:13,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=94160.0, ans=0.0 2023-09-28 17:47:14,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 17:47:15,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:16,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 17:47:23,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:47:23,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:47:25,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:47:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:25,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:47:25,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:25,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=94226.66666666667, ans=0.125 2023-09-28 17:47:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:26,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 17:47:29,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:29,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:47:29,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:33,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:34,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 17:47:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:39,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:41,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:47:43,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:45,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:47:45,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:45,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 17:47:46,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 17:47:48,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 17:47:49,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:50,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:52,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:53,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:47:56,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:47:56,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:48:02,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:04,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 17:48:04,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 17:48:04,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:06,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:07,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:09,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:12,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 17:48:12,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:14,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:48:16,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 17:48:18,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 17:48:18,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=94426.66666666667, ans=0.2 2023-09-28 17:48:21,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:22,657 INFO [train.py:1039] (1/4) Epoch 3, batch 3550, loss[loss=0.2657, simple_loss=0.3277, pruned_loss=0.1018, over 24483.00 frames. ], tot_loss[loss=0.2825, simple_loss=0.3304, pruned_loss=0.1173, over 4709169.60 frames. ], batch size: 66, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:48:22,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:22,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:24,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:27,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:48:39,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:41,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=94560.0, ans=0.1 2023-09-28 17:48:42,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:48:43,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:45,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:48:46,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:49,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:48:49,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:48:52,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:52,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:48:52,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:52,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:48:54,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:48:59,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:48:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:59,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.52 vs. limit=15.0 2023-09-28 17:49:01,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=94626.66666666667, ans=0.0 2023-09-28 17:49:02,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:02,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:49:04,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:49:04,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 17:49:04,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:04,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:06,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:49:12,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:14,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:49:14,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:15,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.64 vs. limit=6.0 2023-09-28 17:49:16,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 17:49:16,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=94693.33333333333, ans=0.125 2023-09-28 17:49:17,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:49:19,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 17:49:21,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:23,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:49:24,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:49:26,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 17:49:27,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:34,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:36,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 17:49:36,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:43,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:44,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 17:49:46,257 INFO [train.py:1039] (1/4) Epoch 3, batch 3600, loss[loss=0.2959, simple_loss=0.3335, pruned_loss=0.1292, over 23635.00 frames. ], tot_loss[loss=0.2817, simple_loss=0.33, pruned_loss=0.1167, over 4718367.55 frames. ], batch size: 149, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:49:50,960 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.527e+02 2.760e+02 3.413e+02 5.643e+02, threshold=5.521e+02, percent-clipped=0.0 2023-09-28 17:49:51,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 17:49:52,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:49:52,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=94826.66666666667, ans=0.1 2023-09-28 17:49:54,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:49:56,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:56,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:57,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:49:59,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=94826.66666666667, ans=0.125 2023-09-28 17:50:00,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:02,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:02,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:50:04,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:50:04,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:04,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 17:50:09,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:50:10,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:14,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:14,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=94893.33333333333, ans=0.125 2023-09-28 17:50:17,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:19,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:50:19,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:19,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 17:50:21,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:21,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:22,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:50:25,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:28,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:29,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:50:29,707 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:50:30,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 17:50:35,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:50:37,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:50:37,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 17:50:42,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:50:44,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=95026.66666666667, ans=0.09899494936611666 2023-09-28 17:50:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:53,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:55,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.12 vs. limit=15.0 2023-09-28 17:50:59,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:50:59,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:50:59,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 17:51:01,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 17:51:01,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 17:51:02,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=95093.33333333333, ans=0.0 2023-09-28 17:51:03,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:51:05,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:51:06,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 17:51:06,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:07,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:51:07,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:08,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 17:51:09,412 INFO [train.py:1039] (1/4) Epoch 3, batch 3650, loss[loss=0.3077, simple_loss=0.3472, pruned_loss=0.1341, over 23778.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3313, pruned_loss=0.1175, over 4716289.02 frames. ], batch size: 179, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:51:09,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 17:51:12,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:51:14,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 17:51:18,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=95160.0, ans=0.125 2023-09-28 17:51:19,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 17:51:21,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:51:24,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 17:51:24,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 17:51:29,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:51:29,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:51:29,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:51:32,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:51:34,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:34,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 17:51:36,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:51:36,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=95226.66666666667, ans=0.0 2023-09-28 17:51:37,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:37,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 17:51:39,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:51:40,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:51:40,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:41,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:51:44,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 17:51:44,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 17:51:45,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.62 vs. limit=15.0 2023-09-28 17:51:45,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:51:48,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 17:51:49,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:51:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:51:50,527 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.27 vs. limit=15.0 2023-09-28 17:51:56,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:51:57,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:57,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:51:59,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:52:01,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:52:03,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:52:06,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:08,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:08,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:52:10,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:52:12,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:52:12,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:17,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=95426.66666666667, ans=0.125 2023-09-28 17:52:18,684 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 17:52:21,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=95426.66666666667, ans=0.125 2023-09-28 17:52:23,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:23,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:25,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:52:25,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:25,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=95426.66666666667, ans=0.125 2023-09-28 17:52:26,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:52:28,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:30,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 17:52:30,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:31,717 INFO [train.py:1039] (1/4) Epoch 3, batch 3700, loss[loss=0.2928, simple_loss=0.3321, pruned_loss=0.1268, over 23674.00 frames. ], tot_loss[loss=0.2828, simple_loss=0.3313, pruned_loss=0.1172, over 4720896.37 frames. ], batch size: 135, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:52:33,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:52:35,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:37,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.521e+02 2.916e+02 3.663e+02 5.180e+02, threshold=5.833e+02, percent-clipped=0.0 2023-09-28 17:52:37,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:52:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:38,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 17:52:38,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:40,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:52:40,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:52:43,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:52:46,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:48,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:49,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:52:49,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:51,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:52:52,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:55,023 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 17:53:01,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:53:01,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:53:03,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:53:04,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 17:53:04,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:08,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:09,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 17:53:13,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:13,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:53:17,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:18,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:53:21,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:53:23,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=95693.33333333333, ans=0.125 2023-09-28 17:53:25,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=95693.33333333333, ans=0.125 2023-09-28 17:53:26,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:26,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 17:53:27,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:53:27,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 17:53:33,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:53:33,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:53:36,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:36,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 17:53:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:53:39,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:53:39,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:39,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:43,560 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.49 vs. limit=15.0 2023-09-28 17:53:45,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:45,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 17:53:47,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 17:53:47,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:53:47,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:53:49,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:53:51,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:53:51,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=95760.0, ans=0.025 2023-09-28 17:53:54,535 INFO [train.py:1039] (1/4) Epoch 3, batch 3750, loss[loss=0.3952, simple_loss=0.4018, pruned_loss=0.1943, over 19520.00 frames. ], tot_loss[loss=0.2844, simple_loss=0.3328, pruned_loss=0.118, over 4705906.19 frames. ], batch size: 388, lr: 2.85e-02, grad_scale: 32.0 2023-09-28 17:53:54,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:54,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:53:57,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:00,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 17:54:00,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 17:54:03,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:54:03,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 17:54:05,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:54:07,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:08,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:11,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:14,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:16,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=95893.33333333333, ans=0.125 2023-09-28 17:54:18,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:54:18,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:54:20,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:54:20,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=95893.33333333333, ans=0.125 2023-09-28 17:54:23,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:23,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 17:54:26,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:27,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:27,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:29,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=95960.0, ans=0.125 2023-09-28 17:54:30,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 17:54:35,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 17:54:37,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:37,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:40,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:44,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=96026.66666666667, ans=0.125 2023-09-28 17:54:45,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:45,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:54:50,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 17:54:52,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:56,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=96026.66666666667, ans=0.07 2023-09-28 17:54:57,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:57,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:55:00,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:55:04,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:55:06,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:55:08,502 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-09-28 17:55:09,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:55:10,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:55:14,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:55:17,121 INFO [train.py:1039] (1/4) Epoch 3, batch 3800, loss[loss=0.2852, simple_loss=0.3401, pruned_loss=0.1152, over 24626.00 frames. ], tot_loss[loss=0.2835, simple_loss=0.3322, pruned_loss=0.1175, over 4713735.74 frames. ], batch size: 68, lr: 2.85e-02, grad_scale: 16.0 2023-09-28 17:55:23,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.013e+02 2.428e+02 2.901e+02 3.496e+02 5.183e+02, threshold=5.803e+02, percent-clipped=0.0 2023-09-28 17:55:23,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:55:27,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:27,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:55:27,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 17:55:28,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:30,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:31,030 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.89 vs. limit=15.0 2023-09-28 17:55:32,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:55:34,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:55:34,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:36,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:55:37,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:37,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:55:37,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:39,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 17:55:43,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:55:43,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:55:47,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:49,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:55:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:55:49,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=96293.33333333333, ans=0.125 2023-09-28 17:55:52,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:55:52,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:56,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:57,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:56:03,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:56:03,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 17:56:05,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:12,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:16,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:56:20,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 17:56:22,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 17:56:23,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:25,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:25,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:26,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 17:56:27,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=96426.66666666667, ans=0.0 2023-09-28 17:56:30,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 17:56:31,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 17:56:31,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:33,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:39,353 INFO [train.py:1039] (1/4) Epoch 3, batch 3850, loss[loss=0.2872, simple_loss=0.3511, pruned_loss=0.1116, over 24651.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3314, pruned_loss=0.1176, over 4704836.44 frames. ], batch size: 73, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:56:39,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:56:40,299 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=32.34 vs. limit=22.5 2023-09-28 17:56:42,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:56:47,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:56:47,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 17:56:48,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:56:48,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:50,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=96493.33333333333, ans=0.1 2023-09-28 17:56:51,033 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.25 vs. limit=15.0 2023-09-28 17:56:53,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:56:58,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:59,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:57:00,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=96560.0, ans=0.125 2023-09-28 17:57:01,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 17:57:08,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:09,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:57:11,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:13,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:57:16,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:18,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:57:19,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:20,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:57:20,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:21,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:57:22,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 17:57:23,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 17:57:23,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:23,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:26,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 17:57:30,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 17:57:31,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:33,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 17:57:36,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:57:42,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:43,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:47,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.25 vs. limit=22.5 2023-09-28 17:57:48,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:49,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 17:57:51,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 17:57:53,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:54,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:55,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.96 vs. limit=15.0 2023-09-28 17:57:58,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:57:58,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:57:59,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:58:01,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 17:58:01,891 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=12.0 2023-09-28 17:58:02,477 INFO [train.py:1039] (1/4) Epoch 3, batch 3900, loss[loss=0.2909, simple_loss=0.3436, pruned_loss=0.1191, over 23260.00 frames. ], tot_loss[loss=0.2817, simple_loss=0.3289, pruned_loss=0.1173, over 4670490.12 frames. ], batch size: 93, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:58:02,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:58:04,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 17:58:04,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:04,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:07,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:58:07,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:09,134 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.471e+02 2.886e+02 3.509e+02 5.748e+02, threshold=5.772e+02, percent-clipped=0.0 2023-09-28 17:58:09,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:58:10,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:10,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:58:10,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:10,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 17:58:12,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:15,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:15,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:15,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:58:17,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:20,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:21,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:25,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:58:26,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 17:58:26,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:28,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 17:58:28,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:29,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 17:58:31,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 17:58:34,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:35,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=96960.0, ans=0.2 2023-09-28 17:58:36,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:36,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:58:37,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:58:40,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:41,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=96960.0, ans=0.125 2023-09-28 17:58:43,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:58:44,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=96960.0, ans=0.0 2023-09-28 17:58:45,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:58:45,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:58:47,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:58:48,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.71 vs. limit=15.0 2023-09-28 17:58:54,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:55,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:59:03,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:59:05,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:59:15,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:15,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=97093.33333333333, ans=0.0 2023-09-28 17:59:18,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:18,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 17:59:20,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 17:59:20,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:21,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 17:59:23,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:59:23,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=97093.33333333333, ans=0.2 2023-09-28 17:59:25,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 17:59:27,208 INFO [train.py:1039] (1/4) Epoch 3, batch 3950, loss[loss=0.299, simple_loss=0.3484, pruned_loss=0.1248, over 23895.00 frames. ], tot_loss[loss=0.2812, simple_loss=0.3287, pruned_loss=0.1168, over 4682987.35 frames. ], batch size: 86, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:59:33,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:59:34,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 17:59:35,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:59:38,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:59:39,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:59:44,572 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 17:59:45,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:46,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 17:59:47,488 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 17:59:47,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:51,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:51,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:59:51,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:55,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 17:59:57,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:59:57,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:57,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:59:58,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:59:58,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:00:12,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:00:12,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:00:17,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 18:00:23,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 18:00:23,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 18:00:23,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:00:25,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:00:33,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:00:33,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:00:33,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:00:33,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:00:35,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 18:00:41,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:00:42,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:00:46,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 18:00:48,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.60 vs. limit=15.0 2023-09-28 18:00:50,725 INFO [train.py:1039] (1/4) Epoch 3, batch 4000, loss[loss=0.3031, simple_loss=0.3369, pruned_loss=0.1346, over 23761.00 frames. ], tot_loss[loss=0.2824, simple_loss=0.3301, pruned_loss=0.1174, over 4688641.36 frames. ], batch size: 150, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:00:55,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:00:56,952 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.653e+02 3.032e+02 3.720e+02 5.555e+02, threshold=6.065e+02, percent-clipped=0.0 2023-09-28 18:00:57,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=97493.33333333333, ans=0.125 2023-09-28 18:01:03,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:09,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:09,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:01:10,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:10,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 18:01:11,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=97560.0, ans=0.0 2023-09-28 18:01:12,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:01:12,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 18:01:12,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:01:14,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 18:01:16,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:19,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:01:20,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:01:20,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:01:20,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:20,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:01:23,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:01:24,516 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 18:01:24,786 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:01:25,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:01:26,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.15 vs. limit=22.5 2023-09-28 18:01:27,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:30,346 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 18:01:31,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:01:31,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:33,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.77 vs. limit=15.0 2023-09-28 18:01:38,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 18:01:38,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:41,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:01:41,644 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 18:01:43,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:01:43,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 18:01:43,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:01:44,403 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.42 vs. limit=15.0 2023-09-28 18:01:45,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:47,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:01:49,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:01:49,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:01:49,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:49,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 18:01:50,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:52,563 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 18:01:58,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:02:03,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 18:02:05,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:02:06,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:06,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:02:08,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:12,359 INFO [train.py:1039] (1/4) Epoch 3, batch 4050, loss[loss=0.2617, simple_loss=0.3286, pruned_loss=0.09744, over 24615.00 frames. ], tot_loss[loss=0.2826, simple_loss=0.331, pruned_loss=0.1171, over 4701555.53 frames. ], batch size: 68, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:02:16,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:19,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:02:19,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 18:02:21,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:02:22,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.59 vs. limit=15.0 2023-09-28 18:02:22,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:02:24,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:02:24,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:27,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:30,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:31,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:02:32,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 18:02:32,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=97893.33333333333, ans=0.07 2023-09-28 18:02:34,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:02:35,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:02:37,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=97893.33333333333, ans=0.125 2023-09-28 18:02:39,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:40,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:44,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:02:44,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=97960.0, ans=0.125 2023-09-28 18:02:45,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 18:02:45,677 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 18:02:50,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:02:50,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=97960.0, ans=0.0 2023-09-28 18:02:50,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=97960.0, ans=0.1 2023-09-28 18:02:58,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 18:02:59,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:01,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:04,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:03:05,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:03:05,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:08,179 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.95 vs. limit=6.0 2023-09-28 18:03:08,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:03:12,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 18:03:12,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:03:13,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:15,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 18:03:17,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=98093.33333333333, ans=0.125 2023-09-28 18:03:17,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=27.53 vs. limit=22.5 2023-09-28 18:03:18,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:25,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 18:03:27,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:27,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:03:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 18:03:30,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 18:03:30,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:32,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:03:33,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:33,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:03:35,369 INFO [train.py:1039] (1/4) Epoch 3, batch 4100, loss[loss=0.2571, simple_loss=0.3158, pruned_loss=0.09914, over 20220.00 frames. ], tot_loss[loss=0.2821, simple_loss=0.3307, pruned_loss=0.1167, over 4696931.19 frames. ], batch size: 44, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:03:42,060 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.855e+02 2.385e+02 2.703e+02 3.359e+02 5.329e+02, threshold=5.406e+02, percent-clipped=0.0 2023-09-28 18:03:43,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 18:03:45,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 18:03:48,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 18:03:49,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 18:03:49,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:49,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:49,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:51,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:03:52,755 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 18:03:55,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:03:56,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:03:56,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=98226.66666666667, ans=10.0 2023-09-28 18:03:58,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:58,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:04:02,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:04:02,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:04:02,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:04:05,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 18:04:05,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:05,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:04:05,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:05,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:04:06,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 18:04:08,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=98293.33333333333, ans=0.2 2023-09-28 18:04:09,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:11,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 18:04:12,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:04:13,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=98293.33333333333, ans=0.1 2023-09-28 18:04:16,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:16,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 18:04:18,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:04:18,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:04:18,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:04:19,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 18:04:22,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:04:24,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:04:25,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 18:04:27,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:27,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:29,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:35,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:04:36,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=98360.0, ans=0.125 2023-09-28 18:04:38,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=98360.0, ans=0.0 2023-09-28 18:04:39,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:39,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:04:48,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:04:48,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:51,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:52,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=98426.66666666667, ans=0.125 2023-09-28 18:04:53,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:04:57,939 INFO [train.py:1039] (1/4) Epoch 3, batch 4150, loss[loss=0.28, simple_loss=0.3413, pruned_loss=0.1094, over 24498.00 frames. ], tot_loss[loss=0.2833, simple_loss=0.3321, pruned_loss=0.1173, over 4698228.65 frames. ], batch size: 63, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:04:58,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:59,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:04:59,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:04:59,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:04,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 18:05:04,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:06,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 18:05:07,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 18:05:08,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 18:05:10,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:13,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=98560.0, ans=0.125 2023-09-28 18:05:15,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:05:15,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:18,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:19,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:05:19,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:05:21,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:05:21,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:23,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:05:27,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:32,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:32,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=98626.66666666667, ans=0.125 2023-09-28 18:05:33,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 18:05:35,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 18:05:36,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:05:36,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 18:05:36,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:05:36,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:05:42,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:05:42,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:47,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 18:05:50,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:05:50,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=98693.33333333333, ans=0.05 2023-09-28 18:05:51,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:05:53,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 18:05:54,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:57,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 18:05:57,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:05:58,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:06:00,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:00,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=98693.33333333333, ans=0.125 2023-09-28 18:06:01,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 18:06:01,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:01,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:06:03,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:06:06,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 18:06:06,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:06,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:06:06,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:06:08,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 18:06:08,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:06:08,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 18:06:09,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:06:11,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:11,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 18:06:13,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:06:17,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:06:19,359 INFO [train.py:1039] (1/4) Epoch 3, batch 4200, loss[loss=0.2651, simple_loss=0.3304, pruned_loss=0.0999, over 24443.00 frames. ], tot_loss[loss=0.2818, simple_loss=0.3313, pruned_loss=0.1162, over 4708822.92 frames. ], batch size: 69, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:06:20,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 18:06:20,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:06:21,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:23,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=98826.66666666667, ans=0.0 2023-09-28 18:06:24,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:06:24,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:24,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:26,713 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.537e+02 2.926e+02 3.391e+02 4.648e+02, threshold=5.852e+02, percent-clipped=0.0 2023-09-28 18:06:26,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 18:06:28,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 18:06:30,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:33,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:35,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:06:37,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:06:39,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:06:40,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:42,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 18:06:42,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:44,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:44,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:44,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:06:45,075 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=13.04 vs. limit=15.0 2023-09-28 18:06:45,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:06:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 18:06:49,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:56,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:06:57,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:06:57,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=98960.0, ans=0.0 2023-09-28 18:06:59,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:06:59,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=98960.0, ans=0.2 2023-09-28 18:07:02,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:05,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:07:05,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 18:07:05,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:07,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:07:12,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:07:13,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:20,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:07:21,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 18:07:25,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:30,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:07:31,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:33,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 18:07:33,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=99093.33333333333, ans=0.2 2023-09-28 18:07:40,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:07:41,851 INFO [train.py:1039] (1/4) Epoch 3, batch 4250, loss[loss=0.252, simple_loss=0.3037, pruned_loss=0.1002, over 24455.00 frames. ], tot_loss[loss=0.2802, simple_loss=0.3298, pruned_loss=0.1153, over 4714856.28 frames. ], batch size: 58, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:07:42,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=99160.0, ans=0.125 2023-09-28 18:07:45,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:45,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:07:45,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.50 vs. limit=22.5 2023-09-28 18:07:46,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:51,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:07:52,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 18:07:52,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:56,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:59,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.98 vs. limit=6.0 2023-09-28 18:08:00,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:05,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:05,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:08,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:08:08,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:08,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:08,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=99226.66666666667, ans=0.1 2023-09-28 18:08:10,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:12,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:15,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:08:16,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:17,742 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.04 vs. limit=15.0 2023-09-28 18:08:18,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 18:08:21,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 18:08:21,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:22,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:22,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:24,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:08:24,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:24,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:27,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:08:27,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:08:33,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:08:35,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:35,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 18:08:35,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:08:37,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 18:08:39,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:08:40,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=99360.0, ans=0.04949747468305833 2023-09-28 18:08:41,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:08:44,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:44,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:46,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 18:08:46,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:08:48,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:08:52,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:55,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:57,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:08:58,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:02,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:03,672 INFO [train.py:1039] (1/4) Epoch 3, batch 4300, loss[loss=0.2611, simple_loss=0.3175, pruned_loss=0.1023, over 24324.00 frames. ], tot_loss[loss=0.2799, simple_loss=0.3297, pruned_loss=0.115, over 4714703.12 frames. ], batch size: 61, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:09:03,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:09:05,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:05,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 18:09:06,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:12,372 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.893e+02 2.623e+02 3.036e+02 3.611e+02 5.200e+02, threshold=6.071e+02, percent-clipped=0.0 2023-09-28 18:09:12,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:12,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:17,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:21,015 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.78 vs. limit=6.0 2023-09-28 18:09:23,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:09:23,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 18:09:25,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:09:27,648 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.30 vs. limit=15.0 2023-09-28 18:09:28,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:09:28,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:09:28,516 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 18:09:31,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:09:33,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:09:36,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 18:09:36,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:09:36,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 18:09:38,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=2.95 vs. limit=15.0 2023-09-28 18:09:40,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:09:41,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:09:47,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:09:47,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:47,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:09:48,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:49,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=99626.66666666667, ans=0.0 2023-09-28 18:09:50,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:50,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 18:09:51,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 18:09:53,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:55,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:55,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:09:55,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:56,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:56,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 18:09:56,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 18:09:56,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 18:09:57,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=99693.33333333333, ans=0.125 2023-09-28 18:09:58,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:09:58,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 18:10:00,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 18:10:03,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:03,590 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 18:10:04,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:10:06,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:06,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:10,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 18:10:10,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:10:10,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:10,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:10,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:10,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:10:10,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=99760.0, ans=0.0 2023-09-28 18:10:12,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:10:15,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:16,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:17,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:23,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 18:10:23,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:10:26,388 INFO [train.py:1039] (1/4) Epoch 3, batch 4350, loss[loss=0.2899, simple_loss=0.3315, pruned_loss=0.1242, over 23832.00 frames. ], tot_loss[loss=0.2803, simple_loss=0.3305, pruned_loss=0.115, over 4721638.79 frames. ], batch size: 164, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:10:29,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:10:31,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:34,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:10:34,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:10:36,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=99826.66666666667, ans=0.125 2023-09-28 18:10:36,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=99826.66666666667, ans=0.0 2023-09-28 18:10:40,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:10:44,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:47,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:10:47,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:47,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=99893.33333333333, ans=0.0 2023-09-28 18:10:48,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:10:51,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=99893.33333333333, ans=0.125 2023-09-28 18:10:52,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=99893.33333333333, ans=0.0 2023-09-28 18:10:53,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:10:55,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:11:01,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 18:11:02,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:02,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:08,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:10,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 18:11:15,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:18,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:11:20,842 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 18:11:21,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:22,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:11:24,108 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 18:11:24,225 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 18:11:24,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:24,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:25,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:11:27,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:29,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:29,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:11:32,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 18:11:32,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:32,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:33,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:33,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 18:11:35,375 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 18:11:35,381 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 18:11:35,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 18:11:38,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:11:38,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:11:39,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:11:39,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:11:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 18:11:45,050 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 18:11:45,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:49,523 INFO [train.py:1039] (1/4) Epoch 3, batch 4400, loss[loss=0.2596, simple_loss=0.3251, pruned_loss=0.09704, over 24319.00 frames. ], tot_loss[loss=0.282, simple_loss=0.3319, pruned_loss=0.1161, over 4732073.51 frames. ], batch size: 61, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:11:49,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:11:49,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:51,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:56,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 18:11:56,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 18:11:56,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 18:11:56,301 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 18:11:57,548 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.556e+02 3.170e+02 3.495e+02 5.491e+02, threshold=6.340e+02, percent-clipped=0.0 2023-09-28 18:11:57,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:11:57,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:12:01,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 18:12:01,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:04,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:04,460 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 18:12:07,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:07,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 18:12:07,639 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 18:12:10,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 18:12:12,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 18:12:12,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 18:12:12,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:13,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:16,051 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.38 vs. limit=22.5 2023-09-28 18:12:16,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 18:12:16,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 18:12:16,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:12:20,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:20,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:20,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=100293.33333333333, ans=0.125 2023-09-28 18:12:22,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:22,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 18:12:23,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 18:12:29,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:29,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=100293.33333333333, ans=0.125 2023-09-28 18:12:31,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=12.0 2023-09-28 18:12:37,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:40,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 18:12:44,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:12:47,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:12:49,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:12:50,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 18:12:51,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:12:51,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:12:51,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:12:52,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:12:57,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 18:13:02,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 18:13:02,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 18:13:02,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:04,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 18:13:04,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:13:07,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:13:09,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 18:13:10,895 INFO [train.py:1039] (1/4) Epoch 3, batch 4450, loss[loss=0.2892, simple_loss=0.3559, pruned_loss=0.1113, over 23946.00 frames. ], tot_loss[loss=0.2817, simple_loss=0.332, pruned_loss=0.1157, over 4727832.13 frames. ], batch size: 80, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:13:13,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:13:16,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:16,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:13:23,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:23,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:13:26,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:28,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:13:28,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:13:28,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:30,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 18:13:30,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:32,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:32,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:13:32,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:13:35,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:13:41,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=100560.0, ans=0.0 2023-09-28 18:13:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:43,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:45,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:45,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:47,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:13:52,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:13:52,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=100626.66666666667, ans=0.0 2023-09-28 18:13:53,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 18:13:53,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 18:13:53,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:13:55,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:57,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 18:14:01,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:14:04,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:04,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 18:14:04,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:04,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:04,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:14:04,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:14:07,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:10,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:14:10,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 18:14:13,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:14:14,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:14:17,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:17,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=100760.0, ans=0.125 2023-09-28 18:14:19,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:19,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:14:21,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:14:23,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.32 vs. limit=22.5 2023-09-28 18:14:26,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 18:14:27,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:14:31,867 INFO [train.py:1039] (1/4) Epoch 3, batch 4500, loss[loss=0.2868, simple_loss=0.352, pruned_loss=0.1108, over 24649.00 frames. ], tot_loss[loss=0.2829, simple_loss=0.3329, pruned_loss=0.1164, over 4720455.24 frames. ], batch size: 73, lr: 2.79e-02, grad_scale: 32.0 2023-09-28 18:14:33,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:34,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 18:14:34,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 18:14:36,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:38,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=100826.66666666667, ans=0.1 2023-09-28 18:14:40,304 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.564e+02 2.888e+02 3.333e+02 4.958e+02, threshold=5.777e+02, percent-clipped=0.0 2023-09-28 18:14:40,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:42,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:42,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:14:44,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:14:44,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:44,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=100826.66666666667, ans=0.0 2023-09-28 18:14:45,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:59,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:59,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:15:03,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:04,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:15:05,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:15:12,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:15:13,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=100960.0, ans=0.2 2023-09-28 18:15:17,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:15:21,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:15:26,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:15:26,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 18:15:26,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:27,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:28,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.20 vs. limit=12.0 2023-09-28 18:15:29,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:29,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:32,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:15:32,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 18:15:32,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:15:34,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:37,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:15:38,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:15:40,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:43,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:15:43,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:15:45,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 18:15:48,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 18:15:48,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 18:15:53,572 INFO [train.py:1039] (1/4) Epoch 3, batch 4550, loss[loss=0.2679, simple_loss=0.3323, pruned_loss=0.1017, over 24670.00 frames. ], tot_loss[loss=0.2821, simple_loss=0.3325, pruned_loss=0.1159, over 4731362.79 frames. ], batch size: 73, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:15:53,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 18:15:55,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 18:15:56,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:15:58,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:15:59,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:16:03,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:08,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:16:11,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:16:12,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:12,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:16:12,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:15,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:15,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:16:19,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:21,565 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.72 vs. limit=15.0 2023-09-28 18:16:22,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 18:16:22,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 18:16:24,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:16:26,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 18:16:30,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 18:16:30,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:33,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 18:16:36,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:16:39,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:16:42,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 18:16:44,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:16:47,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:47,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:48,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:50,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 18:16:52,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 18:16:52,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:16:53,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 18:16:57,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 18:16:57,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:57,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:59,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:59,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:59,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:17:01,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:17:01,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 18:17:04,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:17:04,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:17:05,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 18:17:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:17:05,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 18:17:08,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:17:08,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:17:10,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:17:10,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:17:10,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:17:12,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:17:15,556 INFO [train.py:1039] (1/4) Epoch 3, batch 4600, loss[loss=0.3183, simple_loss=0.3497, pruned_loss=0.1434, over 23794.00 frames. ], tot_loss[loss=0.2801, simple_loss=0.3308, pruned_loss=0.1147, over 4733181.44 frames. ], batch size: 212, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:17:15,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:17:17,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:20,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:17:23,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:17:23,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:17:23,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:24,709 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.433e+02 2.837e+02 3.221e+02 4.908e+02, threshold=5.674e+02, percent-clipped=0.0 2023-09-28 18:17:24,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 18:17:25,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=101493.33333333333, ans=0.2 2023-09-28 18:17:27,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:17:30,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=101560.0, ans=0.1 2023-09-28 18:17:30,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=101560.0, ans=0.0 2023-09-28 18:17:32,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:17:34,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:34,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=101560.0, ans=0.125 2023-09-28 18:17:37,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:42,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 18:17:43,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:45,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:48,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:17:49,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:55,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 18:17:55,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:17:55,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:00,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=101626.66666666667, ans=0.125 2023-09-28 18:18:02,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:03,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:18:05,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:18:09,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 18:18:10,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:18:15,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:16,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:18:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:18,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 18:18:18,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:19,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 18:18:20,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:22,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:23,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:18:25,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:25,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 18:18:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 18:18:26,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 18:18:26,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:28,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:29,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:29,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:38,788 INFO [train.py:1039] (1/4) Epoch 3, batch 4650, loss[loss=0.3157, simple_loss=0.3295, pruned_loss=0.151, over 19371.00 frames. ], tot_loss[loss=0.2798, simple_loss=0.3302, pruned_loss=0.1147, over 4725367.72 frames. ], batch size: 388, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:18:41,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:18:44,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=101826.66666666667, ans=0.2 2023-09-28 18:18:45,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:45,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:46,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:18:47,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:47,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:48,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:50,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=10.43 vs. limit=15.0 2023-09-28 18:18:52,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 18:18:56,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:18:58,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 18:18:58,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:59,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 18:19:00,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:19:01,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 18:19:01,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 18:19:02,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:02,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:19:05,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:19:07,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:07,445 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 18:19:10,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:12,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 18:19:16,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:16,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:19:17,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 18:19:19,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:19:22,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:19:25,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:30,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:34,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:34,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:34,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.96 vs. limit=22.5 2023-09-28 18:19:35,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:19:38,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 18:19:40,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 18:19:41,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 18:19:41,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 18:19:43,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:19:47,684 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.09 vs. limit=15.0 2023-09-28 18:19:52,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:19:52,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:19:52,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 18:19:52,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:53,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:53,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:19:56,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:19:57,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:19:57,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:57,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:20:00,524 INFO [train.py:1039] (1/4) Epoch 3, batch 4700, loss[loss=0.2944, simple_loss=0.3362, pruned_loss=0.1263, over 23816.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.33, pruned_loss=0.1142, over 4729637.40 frames. ], batch size: 164, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:20:03,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:05,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:20:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:20:05,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:20:06,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:20:06,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 18:20:10,642 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.707e+02 3.161e+02 3.958e+02 7.246e+02, threshold=6.322e+02, percent-clipped=4.0 2023-09-28 18:20:14,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:14,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:14,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:20:15,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:17,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:20:21,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=102226.66666666667, ans=0.0 2023-09-28 18:20:24,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 18:20:24,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 18:20:25,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:29,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:20:29,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:20:32,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:39,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:20:41,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:20:42,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:48,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 18:20:50,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:20:53,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:20:56,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=102360.0, ans=0.125 2023-09-28 18:20:57,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 18:20:59,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:04,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:21:04,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 18:21:06,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:06,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:09,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:21:10,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:21:10,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 18:21:12,049 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 18:21:13,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:13,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=102426.66666666667, ans=0.0 2023-09-28 18:21:15,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 18:21:18,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:20,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=102426.66666666667, ans=0.2 2023-09-28 18:21:21,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 18:21:23,335 INFO [train.py:1039] (1/4) Epoch 3, batch 4750, loss[loss=0.2827, simple_loss=0.332, pruned_loss=0.1167, over 23304.00 frames. ], tot_loss[loss=0.2796, simple_loss=0.3301, pruned_loss=0.1146, over 4726630.58 frames. ], batch size: 105, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:21:23,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:21:24,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:30,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:30,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:21:33,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 18:21:33,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:21:35,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 18:21:38,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:21:38,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:39,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:45,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 18:21:50,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:21:52,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 18:21:53,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:59,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:59,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 18:21:59,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 18:22:04,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 18:22:06,960 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:22:08,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:10,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:13,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:22:13,905 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 18:22:13,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:14,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=102693.33333333333, ans=0.025 2023-09-28 18:22:15,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:22:18,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:22:20,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 18:22:20,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 18:22:20,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:22:21,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:22:21,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:23,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:22:23,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 18:22:25,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 18:22:27,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=102693.33333333333, ans=0.125 2023-09-28 18:22:28,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:22:31,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:22:31,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 18:22:33,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:22:33,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:34,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:22:35,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=102760.0, ans=0.125 2023-09-28 18:22:36,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:37,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:22:41,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:41,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 18:22:43,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 18:22:44,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 18:22:46,150 INFO [train.py:1039] (1/4) Epoch 3, batch 4800, loss[loss=0.2535, simple_loss=0.3244, pruned_loss=0.09136, over 24321.00 frames. ], tot_loss[loss=0.2812, simple_loss=0.3317, pruned_loss=0.1154, over 4724434.02 frames. ], batch size: 74, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:22:48,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:22:48,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:49,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 18:22:55,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.876e+02 2.499e+02 2.983e+02 3.709e+02 7.262e+02, threshold=5.966e+02, percent-clipped=1.0 2023-09-28 18:22:55,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:56,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:22:58,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=102826.66666666667, ans=0.5 2023-09-28 18:23:02,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:23:03,412 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.50 vs. limit=15.0 2023-09-28 18:23:04,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:04,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:04,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 18:23:05,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:23:07,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:23:07,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:23:07,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=102893.33333333333, ans=0.0 2023-09-28 18:23:13,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:17,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:23:17,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:23:17,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:19,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:22,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:22,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:23:26,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:23:27,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:29,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 18:23:29,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 18:23:32,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:32,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:23:32,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:23:32,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:33,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=102960.0, ans=0.2 2023-09-28 18:23:34,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:23:35,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:23:35,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:39,808 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.32 vs. limit=10.0 2023-09-28 18:23:40,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:41,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:44,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:23:48,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 18:23:49,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:49,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:49,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:23:52,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:52,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=103093.33333333333, ans=0.0 2023-09-28 18:23:55,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:57,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:23:57,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:57,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:23:58,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:24:00,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:24:03,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:03,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:03,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:24:05,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 18:24:08,456 INFO [train.py:1039] (1/4) Epoch 3, batch 4850, loss[loss=0.3089, simple_loss=0.3314, pruned_loss=0.1432, over 19731.00 frames. ], tot_loss[loss=0.2815, simple_loss=0.3315, pruned_loss=0.1158, over 4726201.02 frames. ], batch size: 388, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:24:08,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 18:24:08,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:08,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:10,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:10,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:13,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:24:21,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 18:24:21,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:26,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:28,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:24:28,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:32,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:33,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:24:35,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:24:35,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 18:24:39,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:42,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:24:42,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:24:42,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:24:42,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 18:24:46,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:46,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:48,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=103293.33333333333, ans=0.125 2023-09-28 18:24:49,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:49,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 18:24:51,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 18:24:51,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:24:56,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.22 vs. limit=22.5 2023-09-28 18:24:59,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:25:00,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 18:25:00,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:25:00,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:25:01,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=103360.0, ans=0.2 2023-09-28 18:25:02,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:25:05,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 18:25:05,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:08,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 18:25:08,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:09,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:11,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 18:25:22,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:28,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:25:28,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:29,925 INFO [train.py:1039] (1/4) Epoch 3, batch 4900, loss[loss=0.2778, simple_loss=0.2982, pruned_loss=0.1287, over 19483.00 frames. ], tot_loss[loss=0.2802, simple_loss=0.3295, pruned_loss=0.1154, over 4702646.37 frames. ], batch size: 388, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:25:35,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 18:25:35,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:25:40,618 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.465e+02 2.992e+02 4.302e+02 8.236e+02, threshold=5.984e+02, percent-clipped=6.0 2023-09-28 18:25:40,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:42,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:42,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:25:44,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=103493.33333333333, ans=0.125 2023-09-28 18:25:45,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 18:25:49,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=103560.0, ans=0.2 2023-09-28 18:25:50,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 18:25:54,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 18:25:55,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 18:25:55,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:25:57,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:57,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:25:57,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:57,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:25:57,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 18:26:00,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 18:26:01,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:26:03,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:26:04,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:26:07,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:26:08,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:26:08,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:08,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 18:26:10,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:26:13,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:26:13,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 18:26:13,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 18:26:18,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 18:26:20,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:26:20,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=103693.33333333333, ans=0.0 2023-09-28 18:26:21,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:26:23,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:26:23,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:26:23,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:26:23,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:26:24,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 18:26:27,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:30,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:26:30,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:26:34,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 18:26:35,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:26:35,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:26:35,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 18:26:42,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=103760.0, ans=0.2 2023-09-28 18:26:44,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:45,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:26:47,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 18:26:47,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:26:47,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:26:51,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:53,061 INFO [train.py:1039] (1/4) Epoch 3, batch 4950, loss[loss=0.2845, simple_loss=0.3319, pruned_loss=0.1186, over 23397.00 frames. ], tot_loss[loss=0.278, simple_loss=0.327, pruned_loss=0.1145, over 4688418.24 frames. ], batch size: 134, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:26:53,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=103826.66666666667, ans=0.1 2023-09-28 18:26:54,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:26:54,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:26:55,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=103826.66666666667, ans=0.1 2023-09-28 18:26:56,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:56,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 18:26:57,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:27:00,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:00,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:27:04,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 18:27:04,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 18:27:05,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:27:05,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 18:27:05,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:06,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:27:07,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:27:07,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:08,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:10,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:27:11,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:27:14,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:14,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:14,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:27:17,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=103893.33333333333, ans=0.0 2023-09-28 18:27:19,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:27:19,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=103893.33333333333, ans=0.125 2023-09-28 18:27:21,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=103893.33333333333, ans=0.125 2023-09-28 18:27:24,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:25,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:27:27,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:27,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:28,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:27:30,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 18:27:31,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 18:27:34,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:37,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:27:37,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:27:39,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:27:39,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:27:40,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:27:42,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:44,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:27:45,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:27:50,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:50,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:50,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=104026.66666666667, ans=0.1 2023-09-28 18:27:52,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 18:27:52,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:27:53,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:27:56,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=12.0 2023-09-28 18:27:56,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:27:59,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:27:59,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:27:59,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=104093.33333333333, ans=0.1 2023-09-28 18:28:01,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:01,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:28:02,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:28:04,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:28:04,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:28:04,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:28:05,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 18:28:09,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:13,301 INFO [train.py:1039] (1/4) Epoch 3, batch 5000, loss[loss=0.2468, simple_loss=0.3081, pruned_loss=0.09281, over 24490.00 frames. ], tot_loss[loss=0.2775, simple_loss=0.327, pruned_loss=0.114, over 4707679.30 frames. ], batch size: 63, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:28:15,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 18:28:15,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:28:22,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:23,501 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.857e+02 2.486e+02 2.809e+02 3.764e+02 5.780e+02, threshold=5.617e+02, percent-clipped=0.0 2023-09-28 18:28:23,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:25,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 18:28:25,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 18:28:26,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:28:29,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 18:28:29,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:28:29,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:28:31,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 18:28:31,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:33,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:28:33,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 18:28:33,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:34,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:28:35,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 18:28:35,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 18:28:36,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:28:38,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 18:28:38,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:28:38,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:39,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:28:39,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 18:28:39,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 18:28:41,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 18:28:41,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:41,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:44,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 18:28:44,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:45,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:46,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:48,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:28:50,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 18:28:50,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:28:52,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:28:57,411 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 18:28:58,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=8.64 vs. limit=12.0 2023-09-28 18:29:00,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:29:02,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:29:02,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:03,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 18:29:04,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:29:04,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:06,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:07,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 18:29:09,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:11,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:13,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:15,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=104360.0, ans=15.0 2023-09-28 18:29:19,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 18:29:19,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=104426.66666666667, ans=0.0 2023-09-28 18:29:23,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:30,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=104426.66666666667, ans=0.125 2023-09-28 18:29:33,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:34,922 INFO [train.py:1039] (1/4) Epoch 3, batch 5050, loss[loss=0.3135, simple_loss=0.3444, pruned_loss=0.1413, over 23771.00 frames. ], tot_loss[loss=0.2783, simple_loss=0.3281, pruned_loss=0.1143, over 4715417.39 frames. ], batch size: 212, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:29:35,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:35,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:29:35,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:36,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:29:36,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:29:37,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 18:29:42,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:29:44,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=104493.33333333333, ans=0.125 2023-09-28 18:29:45,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:45,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=104493.33333333333, ans=0.125 2023-09-28 18:29:47,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:29:48,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 18:29:48,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:50,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:53,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:29:53,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:29:54,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:30:04,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 18:30:04,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:30:06,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:06,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 18:30:06,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:08,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:08,803 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:30:09,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:09,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:30:09,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 18:30:11,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 18:30:13,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:16,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:19,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:20,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 18:30:21,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:24,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 18:30:24,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=104693.33333333333, ans=10.0 2023-09-28 18:30:27,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:30:27,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:30:27,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:29,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:32,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:30:33,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:30:34,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=104693.33333333333, ans=0.0 2023-09-28 18:30:35,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:35,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:30:37,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:30:37,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 18:30:38,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:30:39,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=104760.0, ans=0.125 2023-09-28 18:30:40,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:43,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:43,640 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 18:30:43,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:30:45,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:30:45,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:45,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 18:30:48,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:48,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 18:30:48,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:52,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:52,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:54,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 18:30:55,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 18:30:57,746 INFO [train.py:1039] (1/4) Epoch 3, batch 5100, loss[loss=0.3045, simple_loss=0.3433, pruned_loss=0.1329, over 23521.00 frames. ], tot_loss[loss=0.2785, simple_loss=0.3286, pruned_loss=0.1143, over 4715946.27 frames. ], batch size: 134, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:30:57,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:57,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:30:59,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:31:02,337 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 18:31:03,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:31:06,669 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.935e+02 2.697e+02 3.242e+02 4.082e+02 8.790e+02, threshold=6.484e+02, percent-clipped=7.0 2023-09-28 18:31:06,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 18:31:08,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 18:31:08,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=104826.66666666667, ans=0.125 2023-09-28 18:31:09,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:12,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:31:15,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:31:15,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=104893.33333333333, ans=0.125 2023-09-28 18:31:16,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 18:31:16,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 18:31:20,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:31:22,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:31:25,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:28,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 18:31:28,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:31,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:31:31,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:31:31,289 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:31:31,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=104960.0, ans=0.125 2023-09-28 18:31:32,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 18:31:37,129 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 18:31:37,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:37,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 18:31:37,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 18:31:41,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:44,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=104960.0, ans=0.0 2023-09-28 18:31:45,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=105026.66666666667, ans=0.125 2023-09-28 18:31:51,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:31:53,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 18:31:53,864 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 18:31:55,179 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 18:31:56,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 18:31:56,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:32:00,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 18:32:06,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 18:32:09,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:32:11,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:32:12,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 18:32:14,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:32:14,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 18:32:18,910 INFO [train.py:1039] (1/4) Epoch 3, batch 5150, loss[loss=0.3038, simple_loss=0.3422, pruned_loss=0.1327, over 23721.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.3293, pruned_loss=0.1146, over 4718135.34 frames. ], batch size: 232, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:32:21,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=105160.0, ans=0.125 2023-09-28 18:32:22,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:32:22,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:32:22,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:32:23,299 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.24 vs. limit=6.0 2023-09-28 18:32:24,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:32:24,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:32:24,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:32:24,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=105160.0, ans=0.125 2023-09-28 18:32:25,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 18:32:25,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 18:32:27,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 18:32:27,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:32:27,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 18:32:29,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:29,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:32:30,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:32,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:37,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:32:37,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 18:32:40,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:40,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:32:42,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:32:42,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:32:42,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:32:44,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:32:44,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:32:44,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 18:32:46,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:32:46,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:32:49,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:32:51,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 18:32:53,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:33:00,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:33:04,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 18:33:04,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=105293.33333333333, ans=0.125 2023-09-28 18:33:07,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:14,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:14,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:17,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:19,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:21,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 18:33:25,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:33:25,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:33:27,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:33:27,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=105426.66666666667, ans=0.2 2023-09-28 18:33:30,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:30,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:32,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 18:33:37,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:39,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:33:42,269 INFO [train.py:1039] (1/4) Epoch 3, batch 5200, loss[loss=0.2479, simple_loss=0.2998, pruned_loss=0.09805, over 24565.00 frames. ], tot_loss[loss=0.2797, simple_loss=0.3299, pruned_loss=0.1148, over 4720317.63 frames. ], batch size: 60, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:33:42,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:42,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:33:43,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:33:43,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:33:43,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:33:44,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:33:44,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.79 vs. limit=6.0 2023-09-28 18:33:47,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:33:49,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:33:49,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.58 vs. limit=15.0 2023-09-28 18:33:52,546 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.485e+02 2.931e+02 3.472e+02 7.408e+02, threshold=5.863e+02, percent-clipped=1.0 2023-09-28 18:33:52,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:55,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 18:33:57,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:33:57,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:00,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:00,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:34:02,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:02,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 18:34:07,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:34:09,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:12,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 18:34:13,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:34:13,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:34:15,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 18:34:15,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 18:34:19,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 18:34:21,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:21,738 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 18:34:21,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:21,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:23,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:34:23,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 18:34:24,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:34:27,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:30,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 18:34:30,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 18:34:30,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 18:34:35,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 18:34:36,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:34:41,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:34:42,447 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-09-28 18:34:43,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:34:43,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 18:34:44,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:44,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 18:34:44,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:45,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:34:48,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:34:50,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:34:54,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:55,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:34:55,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:02,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:03,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 18:35:04,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:35:05,447 INFO [train.py:1039] (1/4) Epoch 3, batch 5250, loss[loss=0.2478, simple_loss=0.2981, pruned_loss=0.09872, over 24455.00 frames. ], tot_loss[loss=0.2785, simple_loss=0.3291, pruned_loss=0.1139, over 4727513.08 frames. ], batch size: 58, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:35:05,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:35:05,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:07,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:35:08,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:35:12,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:35:14,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:14,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:35:15,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:35:18,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=105826.66666666667, ans=0.0 2023-09-28 18:35:19,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=105826.66666666667, ans=0.125 2023-09-28 18:35:21,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:24,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:35:27,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:35:30,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:35:32,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 18:35:32,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:32,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:43,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=105960.0, ans=0.0 2023-09-28 18:36:12,636 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.87 vs. limit=15.0 2023-09-28 18:36:13,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=106093.33333333333, ans=0.1 2023-09-28 18:36:15,328 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:36:20,381 INFO [train.py:1039] (1/4) Epoch 3, batch 5300, loss[loss=0.3021, simple_loss=0.3318, pruned_loss=0.1362, over 23947.00 frames. ], tot_loss[loss=0.2765, simple_loss=0.3271, pruned_loss=0.1129, over 4717120.56 frames. ], batch size: 195, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:36:24,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=106160.0, ans=0.125 2023-09-28 18:36:28,675 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 2.515e+02 2.948e+02 3.617e+02 7.012e+02, threshold=5.895e+02, percent-clipped=2.0 2023-09-28 18:36:33,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=106226.66666666667, ans=0.0 2023-09-28 18:36:33,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=106226.66666666667, ans=0.0 2023-09-28 18:36:33,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=106226.66666666667, ans=0.125 2023-09-28 18:36:35,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:36:35,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 18:36:35,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 18:36:35,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:36,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:36,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:36,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:36,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:36,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:36:36,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:36,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:36:37,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:36:37,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 18:36:37,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 18:36:37,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 18:36:37,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:36:37,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 18:36:37,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 18:36:37,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:38,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:38,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:38,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:39,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:36:39,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:39,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:39,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:39,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:39,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:39,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:36:39,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:39,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:36:40,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 18:36:40,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:41,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:41,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 18:36:41,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 18:36:41,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:36:41,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:36:41,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 18:36:42,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 18:36:42,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:43,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:36:43,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:43,781 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 18:36:43,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 18:36:43,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:36:44,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:44,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 18:36:44,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 18:36:44,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 18:36:44,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:54,486 INFO [train.py:1039] (1/4) Epoch 4, batch 0, loss[loss=0.2799, simple_loss=0.3447, pruned_loss=0.1075, over 24680.00 frames. ], tot_loss[loss=0.2799, simple_loss=0.3447, pruned_loss=0.1075, over 24680.00 frames. ], batch size: 73, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:36:54,487 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 18:37:09,551 INFO [train.py:1071] (1/4) Epoch 4, validation: loss=0.3856, simple_loss=0.3373, pruned_loss=0.217, over 1125622.00 frames. 2023-09-28 18:37:09,552 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 18:37:12,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 18:37:14,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:37:15,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:37:21,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:21,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:37:22,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:24,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 18:37:25,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 18:37:27,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:27,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:34,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:37:34,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:36,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 18:37:39,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:41,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=106373.33333333333, ans=0.2 2023-09-28 18:37:46,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:37:46,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:48,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 18:37:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:37:52,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:37:53,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=106373.33333333333, ans=0.125 2023-09-28 18:37:54,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:37:58,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:38:01,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:08,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 18:38:13,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 18:38:13,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:13,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:15,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:38:15,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:38:16,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 18:38:18,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=106506.66666666667, ans=0.2 2023-09-28 18:38:20,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:22,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:25,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:38:28,712 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 18:38:30,523 INFO [train.py:1039] (1/4) Epoch 4, batch 50, loss[loss=0.2899, simple_loss=0.333, pruned_loss=0.1234, over 23778.00 frames. ], tot_loss[loss=0.2798, simple_loss=0.3305, pruned_loss=0.1146, over 1066159.26 frames. ], batch size: 164, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:38:32,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:38:34,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:37,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:37,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 18:38:39,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:38:39,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:38:40,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:42,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:44,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:48,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 18:38:48,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:58,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:39:00,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 18:39:02,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 18:39:05,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:39:07,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:07,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:09,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:09,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:39:11,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:39:11,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:11,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=106706.66666666667, ans=0.125 2023-09-28 18:39:12,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.83 vs. limit=15.0 2023-09-28 18:39:16,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:17,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.44 vs. limit=15.0 2023-09-28 18:39:19,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:19,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:39:19,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 18:39:20,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:39:22,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:39:22,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 18:39:24,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:24,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 18:39:30,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:39:30,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:33,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:35,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:35,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:37,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 18:39:37,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 18:39:37,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:39,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:41,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:41,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:41,311 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:39:43,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 18:39:43,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 18:39:44,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:39:46,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:46,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:39:46,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=106840.0, ans=0.1 2023-09-28 18:39:47,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 2.007e+02 2.562e+02 2.907e+02 3.580e+02 6.238e+02, threshold=5.814e+02, percent-clipped=1.0 2023-09-28 18:39:47,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 18:39:47,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 18:39:49,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:49,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:50,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:39:50,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:39:55,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:39:56,646 INFO [train.py:1039] (1/4) Epoch 4, batch 100, loss[loss=0.3749, simple_loss=0.389, pruned_loss=0.1804, over 19363.00 frames. ], tot_loss[loss=0.2806, simple_loss=0.3317, pruned_loss=0.1148, over 1880055.39 frames. ], batch size: 388, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:39:58,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:40:00,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=106906.66666666667, ans=0.04949747468305833 2023-09-28 18:40:01,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:03,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=106906.66666666667, ans=0.0 2023-09-28 18:40:03,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=106906.66666666667, ans=0.125 2023-09-28 18:40:05,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 18:40:05,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:40:08,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:40:08,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:08,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:40:08,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:40:10,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:11,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 18:40:15,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:40:15,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:15,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:17,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:17,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=106973.33333333333, ans=0.2 2023-09-28 18:40:21,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 18:40:21,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:22,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:22,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:40:24,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:40:29,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 18:40:29,244 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 18:40:30,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:40:30,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:40:32,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=107040.0, ans=0.1 2023-09-28 18:40:35,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:40:35,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:39,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:43,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=107040.0, ans=0.125 2023-09-28 18:40:47,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:47,615 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 18:40:47,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=107106.66666666667, ans=0.95 2023-09-28 18:40:49,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:40:54,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:40:55,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:00,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:02,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:05,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:07,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:41:09,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:10,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:11,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:11,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:41:11,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:13,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 18:41:13,073 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 18:41:13,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:14,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:41:15,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:15,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:15,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:41:16,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:41:16,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:41:16,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:16,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:18,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:18,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:41:18,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:41:19,796 INFO [train.py:1039] (1/4) Epoch 4, batch 150, loss[loss=0.2765, simple_loss=0.3171, pruned_loss=0.1179, over 23452.00 frames. ], tot_loss[loss=0.2757, simple_loss=0.3274, pruned_loss=0.112, over 2511610.47 frames. ], batch size: 134, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:41:21,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:24,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:24,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:41:24,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:28,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:28,740 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.40 vs. limit=15.0 2023-09-28 18:41:29,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:31,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:41:33,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:39,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 18:41:39,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 18:41:39,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 18:41:42,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:41:42,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:41:42,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=107306.66666666667, ans=0.0 2023-09-28 18:41:43,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:44,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=107306.66666666667, ans=0.1 2023-09-28 18:41:45,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:45,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:45,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:48,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:49,733 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 18:41:52,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:54,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=107373.33333333333, ans=0.2 2023-09-28 18:41:58,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:03,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:42:03,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 18:42:09,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:42:09,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:09,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:11,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:42:13,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:42:16,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:42:17,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:18,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 18:42:22,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:24,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:24,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:42:24,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:42:27,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:27,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=107506.66666666667, ans=0.125 2023-09-28 18:42:29,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 18:42:31,537 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.741e+02 2.491e+02 2.943e+02 3.333e+02 6.261e+02, threshold=5.886e+02, percent-clipped=1.0 2023-09-28 18:42:33,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:42:33,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=107506.66666666667, ans=0.1 2023-09-28 18:42:33,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=107506.66666666667, ans=0.125 2023-09-28 18:42:34,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:42:34,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:36,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:42:36,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 18:42:36,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:36,661 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 18:42:40,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:42:41,582 INFO [train.py:1039] (1/4) Epoch 4, batch 200, loss[loss=0.2811, simple_loss=0.3428, pruned_loss=0.1097, over 23793.00 frames. ], tot_loss[loss=0.2744, simple_loss=0.3274, pruned_loss=0.1107, over 3020645.86 frames. ], batch size: 85, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:42:44,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:42:44,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:42:48,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 18:42:50,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:53,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 18:42:54,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=107573.33333333333, ans=0.1 2023-09-28 18:42:54,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=107573.33333333333, ans=0.0 2023-09-28 18:42:56,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:42:56,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:57,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:01,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:43:02,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:43:02,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:19,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:43:19,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:43:20,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=107706.66666666667, ans=0.05 2023-09-28 18:43:21,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:43:23,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:43:23,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 18:43:23,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:43:26,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:26,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:43:26,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:28,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:43:28,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 18:43:29,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:43:29,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:32,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=107773.33333333333, ans=10.0 2023-09-28 18:43:33,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:43:43,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:45,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=107773.33333333333, ans=0.0 2023-09-28 18:43:51,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:53,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:43:57,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:01,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 18:44:01,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:01,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:44:01,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:02,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:44:04,257 INFO [train.py:1039] (1/4) Epoch 4, batch 250, loss[loss=0.2757, simple_loss=0.3086, pruned_loss=0.1215, over 22730.00 frames. ], tot_loss[loss=0.2771, simple_loss=0.3293, pruned_loss=0.1124, over 3385487.00 frames. ], batch size: 322, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:44:04,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 18:44:04,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:44:04,541 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 18:44:07,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:11,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:44:13,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:13,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:14,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:44:14,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:16,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:44:20,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:44:34,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:35,438 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.42 vs. limit=15.0 2023-09-28 18:44:36,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:38,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:44:45,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:44:46,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:44:47,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:44:47,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:47,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:44:47,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:44:48,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:51,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:44:52,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 18:44:52,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:54,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:44:56,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:44:56,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:44:57,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:44:57,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:44:59,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:44:59,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=108106.66666666667, ans=0.125 2023-09-28 18:45:00,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:02,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:45:03,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:07,733 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.99 vs. limit=22.5 2023-09-28 18:45:08,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:45:09,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.71 vs. limit=15.0 2023-09-28 18:45:12,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:13,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:45:18,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:19,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.821e+02 2.378e+02 2.704e+02 3.177e+02 4.711e+02, threshold=5.407e+02, percent-clipped=0.0 2023-09-28 18:45:19,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:45:23,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 18:45:25,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:45:25,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:45:25,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 18:45:25,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:45:26,849 INFO [train.py:1039] (1/4) Epoch 4, batch 300, loss[loss=0.2683, simple_loss=0.2844, pruned_loss=0.1261, over 19283.00 frames. ], tot_loss[loss=0.2745, simple_loss=0.3265, pruned_loss=0.1112, over 3667162.25 frames. ], batch size: 388, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:45:27,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:45:27,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 18:45:32,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:32,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:45:38,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:45:38,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 18:45:40,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:41,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:45:41,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 18:45:41,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:45:47,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:45:52,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:45:52,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 18:45:54,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.67 vs. limit=15.0 2023-09-28 18:45:57,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 18:45:58,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:00,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:01,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:01,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 18:46:01,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:46:04,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:46:07,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:46:07,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:11,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:46:11,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 18:46:13,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:46:16,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:16,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=108440.0, ans=0.125 2023-09-28 18:46:18,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 18:46:18,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:24,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:46:24,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=108440.0, ans=0.04949747468305833 2023-09-28 18:46:27,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:46:27,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 18:46:30,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:30,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:46:33,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:35,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:46:35,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 18:46:35,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:46:38,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:39,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 18:46:41,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:41,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:43,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:44,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:44,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:48,535 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.98 vs. limit=15.0 2023-09-28 18:46:49,312 INFO [train.py:1039] (1/4) Epoch 4, batch 350, loss[loss=0.269, simple_loss=0.3323, pruned_loss=0.1028, over 24067.00 frames. ], tot_loss[loss=0.2715, simple_loss=0.3233, pruned_loss=0.1099, over 3898064.46 frames. ], batch size: 80, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:46:49,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:46:49,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:46:53,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:56,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=108573.33333333333, ans=0.0 2023-09-28 18:46:56,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=108573.33333333333, ans=0.125 2023-09-28 18:47:01,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:47:04,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:04,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:07,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 18:47:09,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:10,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 18:47:12,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:12,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 18:47:13,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=108640.0, ans=0.0 2023-09-28 18:47:14,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:17,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 18:47:18,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:47:21,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:22,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:47:24,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:24,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:25,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:47:26,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:47:26,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:27,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=108706.66666666667, ans=0.125 2023-09-28 18:47:30,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=108706.66666666667, ans=0.2 2023-09-28 18:47:35,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:47:36,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:47:36,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:47:36,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:41,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 18:47:41,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:46,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:47,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:47:47,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:49,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 18:47:50,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:47:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 18:47:53,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 18:47:53,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:54,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.39 vs. limit=15.0 2023-09-28 18:47:57,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:57,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 18:48:00,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:03,918 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.316e+02 2.681e+02 3.192e+02 4.934e+02, threshold=5.363e+02, percent-clipped=0.0 2023-09-28 18:48:04,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:48:06,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:08,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:08,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:09,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:12,995 INFO [train.py:1039] (1/4) Epoch 4, batch 400, loss[loss=0.2678, simple_loss=0.3312, pruned_loss=0.1022, over 24295.00 frames. ], tot_loss[loss=0.27, simple_loss=0.3224, pruned_loss=0.1088, over 4081138.44 frames. ], batch size: 74, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:48:13,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:48:16,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:48:16,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 18:48:17,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:17,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:18,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.43 vs. limit=10.0 2023-09-28 18:48:20,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:48:21,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:22,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:25,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:28,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 18:48:30,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 18:48:30,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:32,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 18:48:32,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:37,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:48:37,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:39,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 18:48:39,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:48:40,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:40,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:40,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:43,247 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 18:48:44,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 18:48:50,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:51,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:51,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 18:48:53,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 18:48:56,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:48:58,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=109040.0, ans=0.0 2023-09-28 18:48:58,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=109040.0, ans=0.5 2023-09-28 18:48:59,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:05,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 18:49:09,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:49:09,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=109106.66666666667, ans=0.09899494936611666 2023-09-28 18:49:10,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 18:49:14,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:49:14,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:49:15,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 18:49:18,830 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-09-28 18:49:19,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:49:23,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:49:23,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:49:25,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=109173.33333333333, ans=0.125 2023-09-28 18:49:26,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:26,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 18:49:29,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:49:30,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 18:49:34,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:49:34,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:49:34,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 18:49:35,805 INFO [train.py:1039] (1/4) Epoch 4, batch 450, loss[loss=0.2377, simple_loss=0.2934, pruned_loss=0.09102, over 17039.00 frames. ], tot_loss[loss=0.2712, simple_loss=0.3234, pruned_loss=0.1095, over 4211876.15 frames. ], batch size: 36, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:49:36,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:49:37,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:49:37,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:49:39,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 18:49:40,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:49:40,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:49:40,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:49:43,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 18:49:43,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:49:44,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:49:45,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.04 vs. limit=15.0 2023-09-28 18:49:46,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:49:55,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:57,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:49:58,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 18:49:58,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 18:50:03,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:50:06,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:08,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:11,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:12,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:16,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 18:50:18,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 18:50:20,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 18:50:21,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:50:22,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=109373.33333333333, ans=0.125 2023-09-28 18:50:23,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:24,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:50:24,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=109373.33333333333, ans=0.125 2023-09-28 18:50:25,628 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 18:50:25,643 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 18:50:25,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:27,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:50:27,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:50:31,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:50:31,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:50:31,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 18:50:32,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 18:50:35,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:37,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:50:37,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:50:38,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 18:50:43,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:50:45,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 18:50:45,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 18:50:46,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:50,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=109506.66666666667, ans=0.2 2023-09-28 18:50:51,107 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.294e+02 2.615e+02 3.130e+02 6.732e+02, threshold=5.230e+02, percent-clipped=1.0 2023-09-28 18:50:53,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:50:55,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:50:58,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:50:58,541 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 18:50:59,902 INFO [train.py:1039] (1/4) Epoch 4, batch 500, loss[loss=0.2749, simple_loss=0.3221, pruned_loss=0.1138, over 23330.00 frames. ], tot_loss[loss=0.2722, simple_loss=0.3243, pruned_loss=0.1101, over 4328410.29 frames. ], batch size: 119, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:51:02,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:04,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:51:04,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:04,330 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 18:51:05,174 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.01 vs. limit=15.0 2023-09-28 18:51:07,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 18:51:07,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:10,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:51:10,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=109573.33333333333, ans=0.95 2023-09-28 18:51:14,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:51:17,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:51:19,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:51:19,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:19,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:28,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.38 vs. limit=15.0 2023-09-28 18:51:31,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:31,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:51:31,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:51:33,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:33,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 18:51:33,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:51:36,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:51:36,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:51:38,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:51:38,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:40,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 18:51:42,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=109706.66666666667, ans=0.1 2023-09-28 18:51:43,326 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 18:51:45,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:51:47,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.whiten.whitening_limit, batch_count=109706.66666666667, ans=12.0 2023-09-28 18:51:47,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:49,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:51:51,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 18:51:55,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:51:55,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:01,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:03,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:52:10,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:12,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 18:52:14,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:14,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:17,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 18:52:19,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:52:20,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:21,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=109906.66666666667, ans=0.125 2023-09-28 18:52:22,108 INFO [train.py:1039] (1/4) Epoch 4, batch 550, loss[loss=0.2797, simple_loss=0.3454, pruned_loss=0.107, over 24071.00 frames. ], tot_loss[loss=0.2725, simple_loss=0.3247, pruned_loss=0.1101, over 4416277.27 frames. ], batch size: 80, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:52:25,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 18:52:26,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 18:52:26,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:26,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 18:52:28,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:52:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:28,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:28,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:29,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:52:30,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:52:33,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:34,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 18:52:34,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:52:38,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:52:38,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:40,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:52:43,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:44,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=109973.33333333333, ans=0.05 2023-09-28 18:52:47,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 18:52:48,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 18:52:49,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=109973.33333333333, ans=0.1 2023-09-28 18:52:51,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:52:58,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:52:58,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:52:59,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:53:04,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:04,265 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 18:53:04,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:53:05,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 18:53:06,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=110040.0, ans=0.0 2023-09-28 18:53:09,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=110106.66666666667, ans=0.1 2023-09-28 18:53:10,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:53:11,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:53:11,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:53:11,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:12,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=110106.66666666667, ans=0.0 2023-09-28 18:53:14,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 18:53:15,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 18:53:15,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:15,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:53:15,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:53:15,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:53:21,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:53:22,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:53:25,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:53:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:28,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:53:28,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:53:29,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:31,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:53:31,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:32,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:53:32,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:53:35,857 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.907e+02 2.501e+02 3.093e+02 3.785e+02 7.626e+02, threshold=6.186e+02, percent-clipped=7.0 2023-09-28 18:53:37,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 18:53:43,633 INFO [train.py:1039] (1/4) Epoch 4, batch 600, loss[loss=0.2707, simple_loss=0.3317, pruned_loss=0.1048, over 24491.00 frames. ], tot_loss[loss=0.2739, simple_loss=0.3257, pruned_loss=0.111, over 4475188.39 frames. ], batch size: 66, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:53:43,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 18:53:43,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:53:45,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:53:45,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:53,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:53:53,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=110240.0, ans=0.125 2023-09-28 18:53:55,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:53:57,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 18:53:58,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:54:01,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:04,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:06,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 18:54:06,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:54:12,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 18:54:12,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=110306.66666666667, ans=0.0 2023-09-28 18:54:18,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:54:18,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:18,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:54:25,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:54:25,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:54:25,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=110373.33333333333, ans=0.125 2023-09-28 18:54:26,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:33,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:54:38,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:38,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:38,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:39,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=110440.0, ans=10.0 2023-09-28 18:54:44,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 18:54:51,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:54:51,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:54:54,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 18:54:55,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=110506.66666666667, ans=0.0 2023-09-28 18:54:56,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:55:00,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 18:55:00,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:55:00,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:55:06,790 INFO [train.py:1039] (1/4) Epoch 4, batch 650, loss[loss=0.2715, simple_loss=0.3356, pruned_loss=0.1037, over 24396.00 frames. ], tot_loss[loss=0.2719, simple_loss=0.3237, pruned_loss=0.11, over 4532045.16 frames. ], batch size: 77, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:55:06,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:55:07,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:55:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:12,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:55:15,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:17,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 18:55:17,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:55:23,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:55:23,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:26,128 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.32 vs. limit=6.0 2023-09-28 18:55:29,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:29,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=110640.0, ans=0.125 2023-09-28 18:55:30,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 18:55:34,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:55:34,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:38,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:55:38,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 18:55:41,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:42,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:42,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:55:43,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:44,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:55:46,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:55:46,249 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 18:55:46,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:46,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:55:49,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:51,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:52,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:55:52,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:55:54,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 18:55:54,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:55:55,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:57,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:55:57,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:59,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:56:02,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 18:56:03,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 18:56:05,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:05,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:56:05,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:56:05,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:56:08,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:56:15,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:15,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:15,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:56:18,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:18,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 18:56:20,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.763e+02 2.387e+02 2.700e+02 3.231e+02 6.128e+02, threshold=5.400e+02, percent-clipped=0.0 2023-09-28 18:56:20,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:26,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:56:26,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:27,736 INFO [train.py:1039] (1/4) Epoch 4, batch 700, loss[loss=0.2732, simple_loss=0.3153, pruned_loss=0.1156, over 23724.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3224, pruned_loss=0.1091, over 4580450.06 frames. ], batch size: 232, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:56:27,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:27,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:31,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.62 vs. limit=12.0 2023-09-28 18:56:32,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 18:56:34,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 18:56:36,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 18:56:36,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:39,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:56:41,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 18:56:45,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:48,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:56:48,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:48,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=110973.33333333333, ans=0.125 2023-09-28 18:56:50,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:56:51,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:53,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:56,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:56:56,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:56:58,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 18:57:00,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 18:57:05,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:57:05,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:57:08,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:57:11,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:57:13,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 18:57:16,341 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.72 vs. limit=22.5 2023-09-28 18:57:18,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:19,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:57:19,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 18:57:24,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:57:26,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:30,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:57:36,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:57:36,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 18:57:40,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 18:57:40,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 18:57:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:45,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:57:46,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:57:50,936 INFO [train.py:1039] (1/4) Epoch 4, batch 750, loss[loss=0.2535, simple_loss=0.3277, pruned_loss=0.0896, over 24675.00 frames. ], tot_loss[loss=0.2694, simple_loss=0.3221, pruned_loss=0.1084, over 4616729.53 frames. ], batch size: 73, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:57:51,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:51,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 18:57:54,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 18:57:54,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 18:57:56,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 18:57:56,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=111240.0, ans=0.0 2023-09-28 18:57:56,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=111240.0, ans=0.125 2023-09-28 18:57:57,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 18:57:57,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 18:57:57,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:57:59,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 18:57:59,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:58:01,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:02,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:04,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:05,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:58:05,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:07,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:58:08,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:58:10,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:58:12,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:14,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:14,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 18:58:15,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:58:16,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=111306.66666666667, ans=0.2 2023-09-28 18:58:16,672 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.09 vs. limit=10.0 2023-09-28 18:58:17,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:17,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=111306.66666666667, ans=0.1 2023-09-28 18:58:18,704 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:20,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:58:21,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 18:58:21,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:58:24,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=111373.33333333333, ans=0.2 2023-09-28 18:58:25,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 18:58:25,612 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 18:58:26,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=111373.33333333333, ans=0.2 2023-09-28 18:58:27,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 18:58:27,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:58:27,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:58:28,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:58:29,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=111373.33333333333, ans=0.125 2023-09-28 18:58:35,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:36,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:36,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:58:38,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:39,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:39,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 18:58:41,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:58:44,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:58:44,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:58:47,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:58:47,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 18:58:48,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:54,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:58:54,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=111506.66666666667, ans=0.2 2023-09-28 18:58:55,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:58:55,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:59,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:59:03,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 18:59:04,537 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.483e+02 2.790e+02 3.186e+02 5.320e+02, threshold=5.579e+02, percent-clipped=0.0 2023-09-28 18:59:04,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:04,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:07,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=111506.66666666667, ans=0.125 2023-09-28 18:59:09,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:12,120 INFO [train.py:1039] (1/4) Epoch 4, batch 800, loss[loss=0.283, simple_loss=0.3393, pruned_loss=0.1133, over 23651.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3228, pruned_loss=0.1089, over 4624989.98 frames. ], batch size: 85, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:59:12,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:12,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:59:14,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=111573.33333333333, ans=0.2 2023-09-28 18:59:20,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:20,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:23,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:23,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:24,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:24,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:25,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=111573.33333333333, ans=0.07 2023-09-28 18:59:26,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:27,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=111640.0, ans=0.0 2023-09-28 18:59:31,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:31,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:59:35,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 18:59:37,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:38,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:38,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:59:39,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:39,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 18:59:39,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:39,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 18:59:42,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:45,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:46,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:46,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:47,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=111706.66666666667, ans=0.1 2023-09-28 18:59:50,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:50,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:56,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:59:56,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:59:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:59:58,482 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 18:59:59,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 18:59:59,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:59:59,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:01,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:02,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:08,273 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 19:00:08,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 19:00:10,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:00:11,381 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.39 vs. limit=22.5 2023-09-28 19:00:13,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:00:18,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:00:21,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:23,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 19:00:23,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:00:23,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=111840.0, ans=0.2 2023-09-28 19:00:26,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 19:00:31,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=111840.0, ans=0.05 2023-09-28 19:00:34,273 INFO [train.py:1039] (1/4) Epoch 4, batch 850, loss[loss=0.2436, simple_loss=0.2987, pruned_loss=0.09428, over 22268.00 frames. ], tot_loss[loss=0.2713, simple_loss=0.3238, pruned_loss=0.1094, over 4645282.63 frames. ], batch size: 48, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:00:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:36,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:00:36,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 19:00:37,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:00:37,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:39,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 19:00:40,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:40,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:00:42,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:44,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:00:46,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:48,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 19:00:48,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 19:00:48,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 19:00:49,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:49,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:00:52,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:52,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:52,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:00:58,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:59,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:59,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 19:01:01,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 19:01:04,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:01:06,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 19:01:10,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 19:01:12,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 19:01:14,081 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 19:01:15,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=112040.0, ans=0.125 2023-09-28 19:01:16,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:16,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:01:16,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:01:17,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 19:01:24,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:24,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:25,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:01:25,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:01:28,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:01:30,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:01:30,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 19:01:35,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:01:35,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:35,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:01:35,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:37,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:38,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:42,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:01:43,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:01:45,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:01:46,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:01:47,931 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.846e+02 2.332e+02 2.611e+02 3.097e+02 5.192e+02, threshold=5.223e+02, percent-clipped=0.0 2023-09-28 19:01:52,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:01:54,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:56,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 19:01:56,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:01:56,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:57,610 INFO [train.py:1039] (1/4) Epoch 4, batch 900, loss[loss=0.2907, simple_loss=0.3432, pruned_loss=0.1191, over 23943.00 frames. ], tot_loss[loss=0.2735, simple_loss=0.3256, pruned_loss=0.1108, over 4657877.79 frames. ], batch size: 86, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:01:59,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 19:02:00,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=112240.0, ans=0.0 2023-09-28 19:02:06,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:02:09,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:09,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 19:02:11,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=112306.66666666667, ans=0.125 2023-09-28 19:02:11,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=112306.66666666667, ans=0.125 2023-09-28 19:02:12,024 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.47 vs. limit=15.0 2023-09-28 19:02:13,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:02:13,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=112306.66666666667, ans=0.2 2023-09-28 19:02:15,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 19:02:15,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:02:16,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:02:16,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:16,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:02:18,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:02:24,078 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.49 vs. limit=22.5 2023-09-28 19:02:27,037 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=12.62 vs. limit=15.0 2023-09-28 19:02:29,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:02:29,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:29,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:02:29,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=112373.33333333333, ans=0.125 2023-09-28 19:02:32,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=112373.33333333333, ans=0.1 2023-09-28 19:02:33,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:38,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 19:02:40,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:02:44,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:02:46,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:02:46,789 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 19:02:48,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 19:02:51,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=112440.0, ans=0.125 2023-09-28 19:02:54,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:02:54,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:02:55,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:03:01,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:01,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:03,576 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.13 vs. limit=15.0 2023-09-28 19:03:05,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 19:03:05,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:03:08,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 19:03:10,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:03:10,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:10,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=112506.66666666667, ans=0.2 2023-09-28 19:03:12,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=112506.66666666667, ans=0.125 2023-09-28 19:03:13,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:03:13,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:16,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 19:03:16,623 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 19:03:18,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:03:18,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 19:03:20,116 INFO [train.py:1039] (1/4) Epoch 4, batch 950, loss[loss=0.2393, simple_loss=0.3058, pruned_loss=0.08641, over 24650.00 frames. ], tot_loss[loss=0.2732, simple_loss=0.3255, pruned_loss=0.1105, over 4672995.59 frames. ], batch size: 65, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:03:21,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:23,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=112573.33333333333, ans=0.125 2023-09-28 19:03:26,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 19:03:30,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:33,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:33,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:34,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:03:36,501 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 19:03:41,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:41,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:03:41,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:43,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:03:43,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 19:03:44,152 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:03:45,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:03:46,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:47,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 19:03:48,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:53,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:55,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 19:03:56,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:03:58,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=112706.66666666667, ans=0.125 2023-09-28 19:04:00,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:04:01,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:04:05,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:06,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:04:08,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 19:04:10,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:04:10,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:04:12,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:12,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:12,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:04:17,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 19:04:18,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:04:21,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:23,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:23,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 19:04:24,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:24,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:04:26,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 19:04:30,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:04:33,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:34,679 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.515e+02 2.858e+02 3.350e+02 4.786e+02, threshold=5.716e+02, percent-clipped=0.0 2023-09-28 19:04:37,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:38,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 19:04:38,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 19:04:42,974 INFO [train.py:1039] (1/4) Epoch 4, batch 1000, loss[loss=0.2813, simple_loss=0.2943, pruned_loss=0.1341, over 19586.00 frames. ], tot_loss[loss=0.2729, simple_loss=0.3242, pruned_loss=0.1108, over 4658207.56 frames. ], batch size: 388, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:04:43,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:46,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 19:04:46,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:51,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:04:53,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 19:04:53,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 19:04:59,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:04:59,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:59,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:04,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 19:05:08,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 19:05:10,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 19:05:10,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:10,797 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.41 vs. limit=15.0 2023-09-28 19:05:11,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 19:05:13,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 19:05:14,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 19:05:15,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=113040.0, ans=0.125 2023-09-28 19:05:16,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:17,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:27,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:27,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:05:29,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:30,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:30,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 19:05:30,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:31,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:05:32,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:32,615 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 19:05:35,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=113106.66666666667, ans=0.125 2023-09-28 19:05:36,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 19:05:37,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 19:05:39,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 19:05:40,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:05:43,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=113106.66666666667, ans=0.125 2023-09-28 19:05:47,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:47,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:05:47,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:49,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:05:50,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 19:05:52,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:05:53,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 19:05:53,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 19:05:55,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:05:55,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:58,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:06:02,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:06:04,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:06,332 INFO [train.py:1039] (1/4) Epoch 4, batch 1050, loss[loss=0.2911, simple_loss=0.3524, pruned_loss=0.1149, over 24634.00 frames. ], tot_loss[loss=0.2707, simple_loss=0.3227, pruned_loss=0.1094, over 4676053.02 frames. ], batch size: 73, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:06:09,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:06:11,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:06:12,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:06:14,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:16,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:06:18,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:06:19,056 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.18 vs. limit=15.0 2023-09-28 19:06:21,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:06:22,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:06:22,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:06:24,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:06:24,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 19:06:26,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:26,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 19:06:29,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:06:29,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 19:06:29,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:06:38,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:38,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:06:40,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:41,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 19:06:41,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 19:06:42,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:45,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 19:06:48,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 19:06:48,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:52,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:06:54,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:06:55,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:06:55,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:06:58,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:07:02,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 19:07:03,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 19:07:03,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 19:07:05,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:05,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:07:07,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 19:07:13,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:07:15,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:15,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:16,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:16,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 19:07:21,390 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.899e+02 2.368e+02 2.685e+02 3.530e+02 6.169e+02, threshold=5.370e+02, percent-clipped=1.0 2023-09-28 19:07:23,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:23,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 19:07:23,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 19:07:24,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:07:24,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=113506.66666666667, ans=0.125 2023-09-28 19:07:28,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:07:28,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=113573.33333333333, ans=0.1 2023-09-28 19:07:29,828 INFO [train.py:1039] (1/4) Epoch 4, batch 1100, loss[loss=0.2945, simple_loss=0.3502, pruned_loss=0.1194, over 24421.00 frames. ], tot_loss[loss=0.2707, simple_loss=0.3233, pruned_loss=0.1091, over 4677927.06 frames. ], batch size: 77, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:07:34,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:07:38,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:07:38,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:07:38,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=113573.33333333333, ans=0.0 2023-09-28 19:07:40,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:40,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 19:07:43,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:07:43,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=113573.33333333333, ans=0.0 2023-09-28 19:07:45,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:07:47,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:07:51,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:07:51,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 19:07:53,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:07:54,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:56,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:57,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:07:57,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=113640.0, ans=0.125 2023-09-28 19:08:00,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:08:04,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:08:06,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 19:08:08,053 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 19:08:09,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:11,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:12,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:08:15,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:08:16,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 19:08:16,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:08:16,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:08:16,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:08:16,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:16,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 19:08:23,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:08:23,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 19:08:23,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=113773.33333333333, ans=0.07 2023-09-28 19:08:26,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:08:29,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=113773.33333333333, ans=0.0 2023-09-28 19:08:30,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:08:34,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 19:08:34,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:08:35,850 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.04 vs. limit=8.0 2023-09-28 19:08:37,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:40,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:08:40,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:42,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 19:08:42,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=113840.0, ans=0.125 2023-09-28 19:08:43,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:08:43,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:44,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 19:08:44,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:08:45,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 19:08:49,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:08:49,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:08:51,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:08:52,494 INFO [train.py:1039] (1/4) Epoch 4, batch 1150, loss[loss=0.2868, simple_loss=0.333, pruned_loss=0.1203, over 23518.00 frames. ], tot_loss[loss=0.2707, simple_loss=0.3232, pruned_loss=0.1091, over 4682115.13 frames. ], batch size: 134, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:08:53,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=113906.66666666667, ans=0.0 2023-09-28 19:08:57,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:08:59,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:09:00,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=113906.66666666667, ans=0.125 2023-09-28 19:09:01,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:01,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:09:02,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 19:09:02,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:05,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 19:09:05,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:05,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:09:12,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 19:09:14,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:20,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:20,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:20,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 19:09:20,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:09:20,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:24,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 19:09:26,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:27,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:30,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=114040.0, ans=0.125 2023-09-28 19:09:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 19:09:47,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:48,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:53,410 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 19:09:56,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:58,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=114173.33333333333, ans=0.0 2023-09-28 19:10:03,237 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 19:10:06,255 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.334e+02 2.773e+02 3.498e+02 6.141e+02, threshold=5.547e+02, percent-clipped=2.0 2023-09-28 19:10:06,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:06,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=114173.33333333333, ans=0.125 2023-09-28 19:10:08,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:10:08,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:10:08,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:10:11,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:14,770 INFO [train.py:1039] (1/4) Epoch 4, batch 1200, loss[loss=0.2793, simple_loss=0.3402, pruned_loss=0.1091, over 24642.00 frames. ], tot_loss[loss=0.2705, simple_loss=0.3234, pruned_loss=0.1088, over 4701469.51 frames. ], batch size: 68, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:10:16,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=114240.0, ans=0.95 2023-09-28 19:10:17,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:10:17,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:10:19,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:19,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:19,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:10:23,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:10:24,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:10:26,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:26,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:29,361 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 19:10:32,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 19:10:37,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:10:39,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:10:42,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=12.0 2023-09-28 19:10:42,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:44,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:10:44,572 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 19:10:44,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:53,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:10:53,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:10:53,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 19:10:55,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:10:59,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 19:11:02,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 19:11:02,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:11:05,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:11:05,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.56 vs. limit=10.0 2023-09-28 19:11:06,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:06,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:11:08,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:11:08,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:11:08,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:11:10,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 19:11:10,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:11:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:12,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:11:13,161 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.84 vs. limit=12.0 2023-09-28 19:11:15,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:15,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:20,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:11:22,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:11:23,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 19:11:27,853 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 19:11:29,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:11:32,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:34,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:11:35,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:37,072 INFO [train.py:1039] (1/4) Epoch 4, batch 1250, loss[loss=0.2821, simple_loss=0.3363, pruned_loss=0.1139, over 24098.00 frames. ], tot_loss[loss=0.2723, simple_loss=0.3246, pruned_loss=0.11, over 4698246.58 frames. ], batch size: 86, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:11:38,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 19:11:42,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:11:43,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=114573.33333333333, ans=0.2 2023-09-28 19:11:44,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:44,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 19:11:48,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:11:49,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:11:51,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=114573.33333333333, ans=0.2 2023-09-28 19:11:54,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:11:54,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:55,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:11:55,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:11:57,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=114640.0, ans=0.1 2023-09-28 19:11:58,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:12:04,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:12:04,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:12:04,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:06,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:12:06,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:09,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:09,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=114706.66666666667, ans=0.125 2023-09-28 19:12:10,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:12:14,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 19:12:15,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:12:19,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:19,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 19:12:21,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:12:21,481 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 19:12:21,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:21,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:24,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:27,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:28,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=114773.33333333333, ans=0.125 2023-09-28 19:12:29,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:12:29,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=114773.33333333333, ans=0.125 2023-09-28 19:12:29,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=114773.33333333333, ans=0.125 2023-09-28 19:12:30,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 19:12:30,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 19:12:31,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=114773.33333333333, ans=0.125 2023-09-28 19:12:32,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 19:12:32,645 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:12:34,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:12:36,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 19:12:37,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:40,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:12:40,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:12:42,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 19:12:42,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:12:43,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:12:43,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:12:45,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:48,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 19:12:49,107 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.12 vs. limit=15.0 2023-09-28 19:12:50,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:51,350 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.852e+02 2.462e+02 2.704e+02 3.277e+02 4.911e+02, threshold=5.408e+02, percent-clipped=0.0 2023-09-28 19:12:51,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:12:53,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:12:57,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:13:00,833 INFO [train.py:1039] (1/4) Epoch 4, batch 1300, loss[loss=0.2772, simple_loss=0.3377, pruned_loss=0.1083, over 24469.00 frames. ], tot_loss[loss=0.2732, simple_loss=0.3255, pruned_loss=0.1105, over 4707897.56 frames. ], batch size: 66, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:13:01,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:13:02,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 19:13:05,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:07,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:13:09,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:12,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:13:13,034 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.09 vs. limit=15.0 2023-09-28 19:13:13,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:13:13,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 19:13:14,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=114906.66666666667, ans=0.2 2023-09-28 19:13:20,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:13:20,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:13:20,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=114973.33333333333, ans=0.0 2023-09-28 19:13:21,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 19:13:26,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:13:30,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:30,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:30,936 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:13:32,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:34,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:35,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:13:35,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=115040.0, ans=0.2 2023-09-28 19:13:37,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:13:37,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 19:13:42,542 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.38 vs. limit=15.0 2023-09-28 19:13:43,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:13:43,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:13:45,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 19:13:47,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:13:48,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:13:51,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:53,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 19:13:53,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:53,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 19:13:54,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:59,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:59,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:14:04,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 19:14:04,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 19:14:06,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 19:14:11,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:14:13,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 19:14:14,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:14,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=115173.33333333333, ans=0.0 2023-09-28 19:14:21,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=115240.0, ans=0.1 2023-09-28 19:14:22,651 INFO [train.py:1039] (1/4) Epoch 4, batch 1350, loss[loss=0.2845, simple_loss=0.3295, pruned_loss=0.1198, over 23636.00 frames. ], tot_loss[loss=0.2718, simple_loss=0.3241, pruned_loss=0.1097, over 4708497.08 frames. ], batch size: 149, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:14:24,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 19:14:28,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:30,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:30,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=115240.0, ans=0.125 2023-09-28 19:14:33,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:33,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:35,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:14:35,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:43,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:44,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 19:14:44,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:14:46,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:14:48,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 19:14:49,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:14:51,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:14:51,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 19:14:52,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 19:14:55,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 19:14:56,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:56,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 19:14:58,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=115373.33333333333, ans=0.1 2023-09-28 19:15:04,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=115373.33333333333, ans=0.125 2023-09-28 19:15:07,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:16,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:18,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:18,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 19:15:21,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:21,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 19:15:21,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:15:23,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:15:26,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:15:29,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 19:15:31,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:15:36,830 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.262e+02 2.530e+02 3.010e+02 4.866e+02, threshold=5.060e+02, percent-clipped=0.0 2023-09-28 19:15:37,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=115506.66666666667, ans=0.1 2023-09-28 19:15:37,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.34 vs. limit=15.0 2023-09-28 19:15:38,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 19:15:40,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 19:15:45,572 INFO [train.py:1039] (1/4) Epoch 4, batch 1400, loss[loss=0.289, simple_loss=0.3166, pruned_loss=0.1307, over 22654.00 frames. ], tot_loss[loss=0.2702, simple_loss=0.3226, pruned_loss=0.1089, over 4708615.41 frames. ], batch size: 322, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:15:45,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 19:15:48,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:49,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=115573.33333333333, ans=0.1 2023-09-28 19:15:52,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:15:52,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:15:57,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 19:16:00,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 19:16:06,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=115640.0, ans=0.2 2023-09-28 19:16:08,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:16:10,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:13,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:16:13,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:16:17,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:16:20,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:16:28,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:30,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:30,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=115706.66666666667, ans=0.2 2023-09-28 19:16:35,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 19:16:37,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:16:38,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:16:38,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:16:38,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:40,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:16:40,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:16:40,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:16:43,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 19:16:43,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:16:49,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:52,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:17:00,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 19:17:01,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:17:03,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:17:06,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 19:17:06,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:07,682 INFO [train.py:1039] (1/4) Epoch 4, batch 1450, loss[loss=0.2722, simple_loss=0.3184, pruned_loss=0.113, over 23365.00 frames. ], tot_loss[loss=0.2692, simple_loss=0.3216, pruned_loss=0.1083, over 4708857.64 frames. ], batch size: 119, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:17:07,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:17:12,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:17:15,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:17:15,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:15,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:17:18,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.34 vs. limit=15.0 2023-09-28 19:17:20,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:20,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:17:22,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:17:22,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 19:17:22,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:17:24,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 19:17:24,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:24,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=115973.33333333333, ans=0.0 2023-09-28 19:17:25,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:25,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 19:17:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:29,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:17:29,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 19:17:31,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:31,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:17:33,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:34,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:38,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:17:38,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:17:39,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:41,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:42,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:43,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:17:43,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:45,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:17:49,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 19:17:52,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:54,385 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 19:17:54,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=116040.0, ans=0.125 2023-09-28 19:17:55,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:17:59,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:17:59,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:00,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 19:18:06,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:06,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 19:18:08,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 19:18:09,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:11,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=116106.66666666667, ans=0.125 2023-09-28 19:18:12,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:14,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:18:15,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 19:18:17,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 19:18:17,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 19:18:20,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:22,090 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.906e+02 2.300e+02 2.689e+02 3.268e+02 5.170e+02, threshold=5.379e+02, percent-clipped=2.0 2023-09-28 19:18:22,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:18:29,426 INFO [train.py:1039] (1/4) Epoch 4, batch 1500, loss[loss=0.2039, simple_loss=0.275, pruned_loss=0.0664, over 24617.00 frames. ], tot_loss[loss=0.2681, simple_loss=0.3209, pruned_loss=0.1076, over 4713786.27 frames. ], batch size: 60, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:18:31,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=116240.0, ans=0.125 2023-09-28 19:18:34,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 19:18:34,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:18:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:18:36,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:38,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:18:40,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 19:18:41,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:18:42,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:18:42,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:43,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:44,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:18:47,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 19:18:54,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:18:54,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:18:56,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:57,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 19:19:02,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 19:19:04,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:19:04,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 19:19:06,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:19:09,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:09,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:19:09,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:19:11,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 19:19:11,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:19:13,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:13,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 19:19:15,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:21,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:19:21,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 19:19:28,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:19:29,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:19:32,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=116440.0, ans=0.125 2023-09-28 19:19:34,020 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 19:19:35,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:35,503 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 19:19:37,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:19:37,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:19:38,758 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 19:19:40,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:19:41,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 19:19:44,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:47,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:47,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:49,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:49,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:51,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:51,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 19:19:52,792 INFO [train.py:1039] (1/4) Epoch 4, batch 1550, loss[loss=0.2546, simple_loss=0.3127, pruned_loss=0.09831, over 24300.00 frames. ], tot_loss[loss=0.2689, simple_loss=0.3219, pruned_loss=0.108, over 4716755.46 frames. ], batch size: 61, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:19:52,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 19:19:52,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:19:53,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 19:19:54,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 19:19:59,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:00,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:00,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:00,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:20:02,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:03,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:06,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=116573.33333333333, ans=0.125 2023-09-28 19:20:07,105 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 19:20:07,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:07,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:20:07,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:20:10,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:20:10,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 19:20:11,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:12,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 19:20:13,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 19:20:13,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 19:20:13,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:15,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:21,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:20:22,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 19:20:22,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 19:20:33,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:36,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:36,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:20:36,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:20:36,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 19:20:41,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:20:42,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:43,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=116773.33333333333, ans=0.0 2023-09-28 19:20:45,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:20:48,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:20:48,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:48,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 19:20:50,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:20:51,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:20:52,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:53,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:20:53,938 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 19:20:56,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:02,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 19:21:06,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:07,902 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.992e+02 2.334e+02 2.588e+02 3.124e+02 7.530e+02, threshold=5.176e+02, percent-clipped=1.0 2023-09-28 19:21:08,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:21:09,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 19:21:11,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:21:12,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:12,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:21:12,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:21:14,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:21:15,444 INFO [train.py:1039] (1/4) Epoch 4, batch 1600, loss[loss=0.2798, simple_loss=0.3212, pruned_loss=0.1192, over 23711.00 frames. ], tot_loss[loss=0.2704, simple_loss=0.3229, pruned_loss=0.1089, over 4707897.88 frames. ], batch size: 232, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:21:18,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:18,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 19:21:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 19:21:21,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=116906.66666666667, ans=0.125 2023-09-28 19:21:23,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 19:21:24,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:21:24,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=116906.66666666667, ans=0.125 2023-09-28 19:21:26,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 19:21:28,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:21:30,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:21:34,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:21:40,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 19:21:43,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:21:43,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 19:21:43,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:45,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 19:21:49,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 19:21:57,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:01,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 19:22:02,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:02,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:02,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:22:05,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 19:22:09,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:22:11,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:22:12,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:13,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:13,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:22:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:22:18,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:22:19,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:22:24,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:25,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:22:29,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 19:22:29,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:22:29,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 19:22:29,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=117173.33333333333, ans=10.0 2023-09-28 19:22:29,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=117173.33333333333, ans=0.125 2023-09-28 19:22:31,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=117173.33333333333, ans=0.0 2023-09-28 19:22:34,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:37,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:22:37,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:22:39,572 INFO [train.py:1039] (1/4) Epoch 4, batch 1650, loss[loss=0.2947, simple_loss=0.3322, pruned_loss=0.1286, over 23757.00 frames. ], tot_loss[loss=0.2708, simple_loss=0.3232, pruned_loss=0.1092, over 4710353.67 frames. ], batch size: 179, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:22:39,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 19:22:39,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 19:22:39,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 19:22:39,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 19:22:40,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=117240.0, ans=0.125 2023-09-28 19:22:44,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:45,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:45,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:22:47,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:22:50,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:53,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 19:22:55,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:57,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:57,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:22:57,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:22:57,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 19:22:57,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 19:23:04,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:23:07,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:23:15,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=117373.33333333333, ans=0.0 2023-09-28 19:23:19,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 19:23:19,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:20,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 19:23:23,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:26,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:23:26,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:23:26,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:27,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:23:29,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:31,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:23:31,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:32,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:32,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:32,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:33,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:23:33,970 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-09-28 19:23:36,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:37,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 19:23:37,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=117440.0, ans=0.125 2023-09-28 19:23:39,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:39,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 19:23:42,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 19:23:42,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 19:23:42,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:43,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:23:45,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:45,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:45,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 19:23:50,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:53,774 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.422e+02 2.762e+02 3.244e+02 4.441e+02, threshold=5.524e+02, percent-clipped=0.0 2023-09-28 19:23:53,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:23:53,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:55,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 19:23:59,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=117506.66666666667, ans=0.125 2023-09-28 19:24:00,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:24:00,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:24:00,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 19:24:01,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:01,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:24:01,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:02,348 INFO [train.py:1039] (1/4) Epoch 4, batch 1700, loss[loss=0.278, simple_loss=0.3113, pruned_loss=0.1223, over 23639.00 frames. ], tot_loss[loss=0.2696, simple_loss=0.3224, pruned_loss=0.1083, over 4701922.32 frames. ], batch size: 256, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:24:05,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:24:06,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:24:06,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 19:24:10,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:24:20,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:22,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:24:27,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:24:28,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:24:30,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:30,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:24:31,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 19:24:34,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:24:35,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:38,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:24:39,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:24:41,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 19:24:43,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 19:24:43,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:45,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 19:24:46,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:24:56,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:24:58,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:24:59,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:25:00,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:25:00,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 19:25:01,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:25:03,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:03,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 19:25:04,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:04,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:04,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:04,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:07,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:07,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:25:09,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:09,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:25:09,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:14,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:16,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 19:25:18,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:18,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 19:25:24,954 INFO [train.py:1039] (1/4) Epoch 4, batch 1750, loss[loss=0.2598, simple_loss=0.3264, pruned_loss=0.09658, over 24653.00 frames. ], tot_loss[loss=0.268, simple_loss=0.3218, pruned_loss=0.107, over 4715145.24 frames. ], batch size: 65, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:25:26,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:28,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:28,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:25:30,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 19:25:31,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:34,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:25:34,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:39,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 19:25:40,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:43,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.59 vs. limit=15.0 2023-09-28 19:25:44,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 19:25:44,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:25:48,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:25:51,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 19:25:51,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:52,481 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.88 vs. limit=22.5 2023-09-28 19:25:53,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 19:26:03,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:26:06,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:06,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:12,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:12,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:14,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:26:16,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:18,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:19,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:26:19,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 19:26:20,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=118106.66666666667, ans=0.07 2023-09-28 19:26:21,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:24,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 19:26:24,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:25,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:26,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=118106.66666666667, ans=0.0 2023-09-28 19:26:27,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:26:32,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:26:32,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:26:32,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:36,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:39,668 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.369e+02 2.693e+02 3.225e+02 5.418e+02, threshold=5.386e+02, percent-clipped=0.0 2023-09-28 19:26:41,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:44,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:26:44,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:26:47,449 INFO [train.py:1039] (1/4) Epoch 4, batch 1800, loss[loss=0.2732, simple_loss=0.3381, pruned_loss=0.1041, over 24265.00 frames. ], tot_loss[loss=0.2665, simple_loss=0.3198, pruned_loss=0.1066, over 4690014.14 frames. ], batch size: 74, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:26:47,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 19:26:47,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:47,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=118240.0, ans=0.1 2023-09-28 19:26:48,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:26:48,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:26:49,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:26:49,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:26:49,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=118240.0, ans=0.2 2023-09-28 19:26:50,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:26:52,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:26:54,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:55,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:26:57,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=118240.0, ans=0.125 2023-09-28 19:26:58,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:00,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:27:04,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:27:07,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:10,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:11,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:12,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:27:14,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:27:14,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 19:27:15,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:19,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:21,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 19:27:24,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 19:27:24,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 19:27:24,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:26,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:26,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:27:26,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=22.5 2023-09-28 19:27:27,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:27:33,970 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 19:27:34,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:27:36,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:38,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 19:27:38,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 19:27:40,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:27:40,469 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:27:41,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:27:41,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:27:42,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=118440.0, ans=0.0 2023-09-28 19:27:44,194 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.71 vs. limit=10.0 2023-09-28 19:27:46,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 19:27:51,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:27:51,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 19:27:51,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:53,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:53,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:27:53,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 19:27:58,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:27:58,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:01,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 19:28:01,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:02,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=118506.66666666667, ans=0.0 2023-09-28 19:28:03,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:04,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:28:04,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:06,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:06,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:28:10,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:28:10,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:11,793 INFO [train.py:1039] (1/4) Epoch 4, batch 1850, loss[loss=0.2785, simple_loss=0.3261, pruned_loss=0.1154, over 23639.00 frames. ], tot_loss[loss=0.2679, simple_loss=0.3209, pruned_loss=0.1074, over 4696973.30 frames. ], batch size: 232, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:28:13,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:28:13,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:28:14,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-09-28 19:28:23,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:28:23,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 19:28:26,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 19:28:26,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=118640.0, ans=0.0 2023-09-28 19:28:29,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 19:28:33,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:35,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 19:28:35,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 19:28:44,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:28:44,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=118706.66666666667, ans=0.2 2023-09-28 19:28:46,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 19:28:49,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:28:50,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:28:52,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=12.0 2023-09-28 19:28:54,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 19:28:54,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:54,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:28:57,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:28:58,632 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.45 vs. limit=22.5 2023-09-28 19:28:59,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:29:01,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:03,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-09-28 19:29:04,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:29:04,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:06,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:29:06,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:09,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:09,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:29:14,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 19:29:14,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:16,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=118773.33333333333, ans=0.0 2023-09-28 19:29:19,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:29:19,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:29:19,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 19:29:19,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 19:29:20,965 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 19:29:21,086 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 19:29:24,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:29:24,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:29:24,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:24,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:24,752 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 19:29:24,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:29:26,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:27,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.551e+02 2.974e+02 3.413e+02 5.793e+02, threshold=5.947e+02, percent-clipped=2.0 2023-09-28 19:29:28,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:29:28,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:29:29,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:29:29,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 19:29:31,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=118840.0, ans=0.1 2023-09-28 19:29:33,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:33,092 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 19:29:33,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:29:34,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:36,069 INFO [train.py:1039] (1/4) Epoch 4, batch 1900, loss[loss=0.2796, simple_loss=0.3407, pruned_loss=0.1093, over 24671.00 frames. ], tot_loss[loss=0.2682, simple_loss=0.3221, pruned_loss=0.1072, over 4699629.25 frames. ], batch size: 73, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:29:39,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:41,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:29:43,547 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 19:29:43,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 19:29:45,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:46,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:46,620 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 19:29:46,692 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 19:29:51,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 19:29:54,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:29:55,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=118973.33333333333, ans=0.125 2023-09-28 19:29:57,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 19:30:00,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 19:30:08,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 19:30:11,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 19:30:13,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:30:13,114 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 19:30:13,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 19:30:14,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 19:30:14,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 19:30:14,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:30:20,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 19:30:23,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:30:27,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:27,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 19:30:29,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:30:29,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=119106.66666666667, ans=0.125 2023-09-28 19:30:32,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 19:30:33,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:42,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:30:42,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:30:42,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:30:42,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:30:44,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:30:45,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:30:45,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:30:48,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:48,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:30:52,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:30:52,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:54,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:54,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:58,756 INFO [train.py:1039] (1/4) Epoch 4, batch 1950, loss[loss=0.2837, simple_loss=0.352, pruned_loss=0.1077, over 24566.00 frames. ], tot_loss[loss=0.2678, simple_loss=0.322, pruned_loss=0.1068, over 4715600.30 frames. ], batch size: 71, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:30:58,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:31:00,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=119240.0, ans=0.125 2023-09-28 19:31:01,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:31:03,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:03,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:31:05,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 19:31:06,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:31:07,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:07,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.76 vs. limit=22.5 2023-09-28 19:31:10,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:12,270 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:31:13,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:31:13,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:13,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:15,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=119306.66666666667, ans=0.95 2023-09-28 19:31:17,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:17,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=119306.66666666667, ans=0.125 2023-09-28 19:31:18,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:31:18,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:31:19,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=119306.66666666667, ans=0.1 2023-09-28 19:31:20,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:31:20,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:20,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=119306.66666666667, ans=0.0 2023-09-28 19:31:24,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:27,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:31:27,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:27,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:31:27,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 19:31:29,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:31:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:31:29,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:36,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:37,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:31:42,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:31:46,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:31:46,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:31:46,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 19:31:46,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:31:50,654 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=12.0 2023-09-28 19:31:52,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:52,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:31:52,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:00,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:01,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:01,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=119440.0, ans=0.125 2023-09-28 19:32:06,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:08,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:08,659 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:32:09,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:32:11,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:11,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 19:32:11,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:32:12,783 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.624e+02 2.939e+02 3.496e+02 6.198e+02, threshold=5.878e+02, percent-clipped=1.0 2023-09-28 19:32:12,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:32:13,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=119506.66666666667, ans=0.0 2023-09-28 19:32:14,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 19:32:16,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:21,589 INFO [train.py:1039] (1/4) Epoch 4, batch 2000, loss[loss=0.2601, simple_loss=0.3106, pruned_loss=0.1048, over 23420.00 frames. ], tot_loss[loss=0.2674, simple_loss=0.3221, pruned_loss=0.1063, over 4713607.81 frames. ], batch size: 93, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:32:21,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:23,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:32:23,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:32:26,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:32:27,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:31,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 19:32:31,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:32:34,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:32:36,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 19:32:36,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:32:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:32:41,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 19:32:43,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:47,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 19:32:47,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:32:49,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 19:32:49,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:32:53,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:32:53,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:32:53,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:55,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:32:55,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:32:56,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 19:33:00,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 19:33:00,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:33:00,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:06,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:08,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:33:08,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:09,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:33:11,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:11,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:13,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:13,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:16,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:19,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:33:19,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 19:33:24,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:33:26,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=119840.0, ans=0.0 2023-09-28 19:33:27,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:31,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=119840.0, ans=10.0 2023-09-28 19:33:32,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:32,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:33:35,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:35,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=119840.0, ans=0.1 2023-09-28 19:33:39,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:39,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:39,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:33:39,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:33:42,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:44,037 INFO [train.py:1039] (1/4) Epoch 4, batch 2050, loss[loss=0.2653, simple_loss=0.3211, pruned_loss=0.1048, over 23347.00 frames. ], tot_loss[loss=0.2663, simple_loss=0.3205, pruned_loss=0.106, over 4705605.05 frames. ], batch size: 93, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:33:44,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:47,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:48,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:55,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:57,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:33:57,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:58,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:02,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 19:34:02,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:34:02,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:04,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:34:10,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=119973.33333333333, ans=0.0 2023-09-28 19:34:14,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:14,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:17,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 19:34:19,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:20,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 19:34:20,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:24,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:27,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:29,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:34:29,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:30,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:34:31,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:34:31,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:34:32,789 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:34:36,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:37,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:34:39,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:34:39,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:44,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:34:49,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:49,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=120173.33333333333, ans=0.125 2023-09-28 19:34:50,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 19:34:55,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:34:57,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:34:59,047 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.560e+02 3.020e+02 3.672e+02 5.923e+02, threshold=6.041e+02, percent-clipped=1.0 2023-09-28 19:34:59,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=120173.33333333333, ans=0.1 2023-09-28 19:35:00,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:35:02,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 19:35:06,675 INFO [train.py:1039] (1/4) Epoch 4, batch 2100, loss[loss=0.2924, simple_loss=0.3299, pruned_loss=0.1275, over 23777.00 frames. ], tot_loss[loss=0.2654, simple_loss=0.3192, pruned_loss=0.1057, over 4695730.51 frames. ], batch size: 164, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:35:06,914 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 19:35:06,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:07,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:08,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:09,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:35:09,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 19:35:10,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 19:35:12,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:35:14,212 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.39 vs. limit=22.5 2023-09-28 19:35:15,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:35:17,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:35:20,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:22,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:35:22,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 19:35:22,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:35:23,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 19:35:23,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 19:35:25,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:26,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:35:26,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 19:35:26,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 19:35:29,720 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.68 vs. limit=15.0 2023-09-28 19:35:32,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 19:35:32,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:36,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:35:37,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:39,576 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.50 vs. limit=15.0 2023-09-28 19:35:40,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:35:40,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 19:35:40,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=120373.33333333333, ans=0.035 2023-09-28 19:35:41,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:41,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:35:43,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 19:35:43,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:43,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 19:35:45,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 19:35:45,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 19:35:48,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:35:50,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:35:52,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:54,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:57,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:59,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:59,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 19:35:59,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:59,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:00,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:00,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 19:36:02,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 19:36:02,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 19:36:06,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:36:09,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:36:10,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 19:36:13,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:17,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:36:18,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:36:18,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:36:18,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:36:20,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:36:23,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:23,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:36:23,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:36:23,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:27,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 19:36:30,134 INFO [train.py:1039] (1/4) Epoch 4, batch 2150, loss[loss=0.2745, simple_loss=0.3333, pruned_loss=0.1079, over 23277.00 frames. ], tot_loss[loss=0.2645, simple_loss=0.3178, pruned_loss=0.1055, over 4698053.90 frames. ], batch size: 93, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:36:30,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 19:36:30,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:31,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:31,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:36:31,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:36:32,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=120573.33333333333, ans=0.125 2023-09-28 19:36:33,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:36:37,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:36:38,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:41,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:44,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:36:44,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:44,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:36:47,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:47,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=120640.0, ans=0.125 2023-09-28 19:36:49,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:36:49,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:36:52,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:52,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 19:36:56,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=120640.0, ans=0.0 2023-09-28 19:36:57,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:00,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:37:00,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:00,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=120640.0, ans=0.125 2023-09-28 19:37:02,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:02,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:37:02,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:37:02,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:37:04,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:37:04,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 19:37:07,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:37:07,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=120706.66666666667, ans=0.0 2023-09-28 19:37:08,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:10,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:37:11,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:37:14,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:16,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:37:16,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:16,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 19:37:17,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=120706.66666666667, ans=0.09899494936611666 2023-09-28 19:37:18,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:37:21,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:21,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:22,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:24,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:37:24,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:26,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:26,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 19:37:27,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 19:37:27,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:37:27,883 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 19:37:29,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:30,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:37:31,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 19:37:31,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:37:32,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 19:37:32,868 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 19:37:32,869 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 19:37:32,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 19:37:33,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=120773.33333333333, ans=0.125 2023-09-28 19:37:35,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:36,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:37:36,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:37:36,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:38,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:37:38,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:40,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:45,490 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.345e+02 2.868e+02 3.477e+02 5.291e+02, threshold=5.737e+02, percent-clipped=0.0 2023-09-28 19:37:47,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:37:48,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 19:37:53,310 INFO [train.py:1039] (1/4) Epoch 4, batch 2200, loss[loss=0.3501, simple_loss=0.3618, pruned_loss=0.1692, over 19491.00 frames. ], tot_loss[loss=0.2658, simple_loss=0.3184, pruned_loss=0.1065, over 4690816.75 frames. ], batch size: 388, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:37:53,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:37:58,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:58,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:37:59,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:01,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:38:04,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:38:04,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:38:04,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 19:38:11,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 19:38:13,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:38:20,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 19:38:23,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:25,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:25,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:38:28,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:38:30,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 19:38:30,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=121040.0, ans=0.2 2023-09-28 19:38:31,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:38:34,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:34,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 19:38:38,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:38:39,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:40,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=121040.0, ans=0.125 2023-09-28 19:38:43,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:38:44,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:48,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 19:38:48,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:49,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 19:38:52,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:52,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:38:52,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:55,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:56,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:56,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:56,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:58,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:38:58,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:38:59,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:39:02,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:39:04,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:07,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:39:07,566 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 19:39:10,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:39:11,988 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 19:39:12,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:39:14,087 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 19:39:15,505 INFO [train.py:1039] (1/4) Epoch 4, batch 2250, loss[loss=0.3698, simple_loss=0.3824, pruned_loss=0.1786, over 19204.00 frames. ], tot_loss[loss=0.2665, simple_loss=0.3191, pruned_loss=0.1069, over 4688681.20 frames. ], batch size: 388, lr: 2.42e-02, grad_scale: 64.0 2023-09-28 19:39:15,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:17,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:39:19,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:20,764 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 19:39:20,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:39:24,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:26,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=121240.0, ans=0.1 2023-09-28 19:39:30,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:39:32,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:39:36,027 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.37 vs. limit=15.0 2023-09-28 19:39:38,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:38,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:38,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:40,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 19:39:40,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:39:41,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:39:42,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 19:39:42,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:39:42,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:45,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:49,708 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=15.0 2023-09-28 19:39:50,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:52,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:39:52,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:39:54,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 19:39:57,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:58,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:40:02,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:02,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=121373.33333333333, ans=0.2 2023-09-28 19:40:04,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:05,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:05,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:40:08,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:40:09,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=121440.0, ans=0.125 2023-09-28 19:40:10,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:40:13,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:40:16,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:40:21,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:40:21,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:40:21,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:40:27,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:40:27,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-09-28 19:40:28,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=121506.66666666667, ans=0.015 2023-09-28 19:40:30,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:40:30,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 19:40:32,082 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.293e+02 2.607e+02 2.902e+02 3.937e+02, threshold=5.215e+02, percent-clipped=0.0 2023-09-28 19:40:32,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:32,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:40:37,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 19:40:38,975 INFO [train.py:1039] (1/4) Epoch 4, batch 2300, loss[loss=0.2668, simple_loss=0.3167, pruned_loss=0.1085, over 23359.00 frames. ], tot_loss[loss=0.2667, simple_loss=0.3198, pruned_loss=0.1068, over 4699558.39 frames. ], batch size: 119, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:40:39,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:40:40,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:45,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:46,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:40:50,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 19:40:51,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:57,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:40:57,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:40:59,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:00,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:41:00,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 19:41:02,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:41:03,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:05,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:41:08,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:41:12,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:41:14,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-09-28 19:41:15,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:41:21,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:41:26,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:41:27,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:41:30,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:32,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:41:32,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:41:32,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 19:41:37,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:41:37,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:39,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:41:40,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:42,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 19:41:42,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:41:42,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 19:41:42,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:41:42,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:44,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 19:41:49,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:41:50,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=121840.0, ans=0.1 2023-09-28 19:41:50,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=121840.0, ans=0.0 2023-09-28 19:41:50,652 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=15.0 2023-09-28 19:41:54,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:41:54,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=121840.0, ans=0.0 2023-09-28 19:41:59,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:59,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:41:59,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:42:00,446 INFO [train.py:1039] (1/4) Epoch 4, batch 2350, loss[loss=0.2687, simple_loss=0.3316, pruned_loss=0.1029, over 24322.00 frames. ], tot_loss[loss=0.2678, simple_loss=0.3209, pruned_loss=0.1073, over 4706912.39 frames. ], batch size: 74, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:42:00,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:42:00,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:02,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:42:02,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 19:42:04,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=121906.66666666667, ans=0.125 2023-09-28 19:42:10,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:10,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 19:42:17,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 19:42:21,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:42:24,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:24,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:26,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 19:42:29,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:42:34,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 19:42:35,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:38,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:42:40,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:42,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:42:44,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 19:42:44,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:42:46,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=122040.0, ans=0.125 2023-09-28 19:42:46,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=122040.0, ans=0.0 2023-09-28 19:42:47,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:47,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:42:47,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:42:49,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=122106.66666666667, ans=0.125 2023-09-28 19:42:52,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:42:55,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 19:42:56,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:58,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:58,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:42:59,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 19:43:01,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:43:03,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 19:43:04,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:43:09,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 19:43:11,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 19:43:12,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:43:12,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:43:12,482 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 19:43:12,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 19:43:16,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 19:43:18,097 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.367e+02 2.843e+02 3.356e+02 5.882e+02, threshold=5.686e+02, percent-clipped=1.0 2023-09-28 19:43:18,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:43:20,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.89 vs. limit=22.5 2023-09-28 19:43:22,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:43:24,099 INFO [train.py:1039] (1/4) Epoch 4, batch 2400, loss[loss=0.2707, simple_loss=0.2947, pruned_loss=0.1234, over 19318.00 frames. ], tot_loss[loss=0.2666, simple_loss=0.3197, pruned_loss=0.1068, over 4702570.07 frames. ], batch size: 388, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:43:28,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:43:29,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:43:31,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 19:43:31,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 19:43:39,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:43:39,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:43:40,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 19:43:40,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:43:42,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 19:43:44,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=122306.66666666667, ans=0.0 2023-09-28 19:43:49,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:52,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 19:43:57,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:44:02,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 19:44:04,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:07,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:12,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:12,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 19:44:12,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:44:18,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:22,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:44:24,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:24,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=122440.0, ans=0.0 2023-09-28 19:44:25,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:44:25,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:44:27,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:44:27,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:27,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:29,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:44:33,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:44:34,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:44:34,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 19:44:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 19:44:39,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:39,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:39,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 19:44:41,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 19:44:41,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 19:44:41,826 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 19:44:43,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 19:44:43,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=122506.66666666667, ans=0.0 2023-09-28 19:44:44,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:45,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:45,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:45,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=122506.66666666667, ans=0.125 2023-09-28 19:44:46,596 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 19:44:46,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:48,026 INFO [train.py:1039] (1/4) Epoch 4, batch 2450, loss[loss=0.2635, simple_loss=0.3046, pruned_loss=0.1112, over 23735.00 frames. ], tot_loss[loss=0.264, simple_loss=0.3171, pruned_loss=0.1054, over 4697583.12 frames. ], batch size: 232, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:44:48,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:44:48,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=122573.33333333333, ans=0.125 2023-09-28 19:44:51,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:44:51,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:54,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=122573.33333333333, ans=0.0 2023-09-28 19:44:55,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=23.70 vs. limit=22.5 2023-09-28 19:44:56,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:56,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:58,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 19:45:02,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:02,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:06,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:45:08,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:45:08,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:45:08,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 19:45:13,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:15,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:45:16,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:45:20,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:45:20,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:21,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=122706.66666666667, ans=0.0 2023-09-28 19:45:23,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:23,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:45:24,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 19:45:26,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:45:26,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=122706.66666666667, ans=0.125 2023-09-28 19:45:33,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=122706.66666666667, ans=0.125 2023-09-28 19:45:34,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:36,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:36,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:36,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:45:38,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:38,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=122773.33333333333, ans=0.125 2023-09-28 19:45:39,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:45:39,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 19:45:42,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=122773.33333333333, ans=0.1 2023-09-28 19:45:45,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:45,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:45:48,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:45:48,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:54,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:45:54,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 19:45:56,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:45:56,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:56,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 19:45:57,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:45:58,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:46:02,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:46:02,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=122840.0, ans=0.125 2023-09-28 19:46:03,797 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.418e+02 2.743e+02 3.147e+02 4.422e+02, threshold=5.485e+02, percent-clipped=0.0 2023-09-28 19:46:05,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:05,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:46:08,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=122840.0, ans=0.0 2023-09-28 19:46:08,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=122840.0, ans=0.125 2023-09-28 19:46:09,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 19:46:09,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:46:11,543 INFO [train.py:1039] (1/4) Epoch 4, batch 2500, loss[loss=0.2784, simple_loss=0.3229, pruned_loss=0.1169, over 23742.00 frames. ], tot_loss[loss=0.2637, simple_loss=0.3169, pruned_loss=0.1053, over 4695267.56 frames. ], batch size: 164, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:46:13,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=122906.66666666667, ans=0.07 2023-09-28 19:46:19,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:28,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:46:28,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:46:29,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:29,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 19:46:31,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.62 vs. limit=22.5 2023-09-28 19:46:37,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:46:37,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:46:38,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:46:38,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:46:40,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 19:46:40,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:42,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:44,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 19:46:44,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:44,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 19:46:44,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:46:51,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:52,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:56,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:46:56,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 19:46:56,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:46:59,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:04,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:07,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:10,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=123106.66666666667, ans=0.0 2023-09-28 19:47:12,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:15,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:47:15,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.77 vs. limit=12.0 2023-09-28 19:47:17,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 19:47:17,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:19,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:47:20,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:47:20,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:47:22,756 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 19:47:22,757 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 19:47:22,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 19:47:26,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:29,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 19:47:29,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 19:47:29,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:29,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=123173.33333333333, ans=0.0 2023-09-28 19:47:31,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 19:47:34,350 INFO [train.py:1039] (1/4) Epoch 4, batch 2550, loss[loss=0.2739, simple_loss=0.321, pruned_loss=0.1134, over 23754.00 frames. ], tot_loss[loss=0.2642, simple_loss=0.3173, pruned_loss=0.1055, over 4704464.11 frames. ], batch size: 179, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:47:36,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 19:47:37,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:39,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:47:39,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=123240.0, ans=0.0 2023-09-28 19:47:40,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:47:42,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:44,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 19:47:45,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:47:48,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 19:47:50,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:47:54,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:56,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:56,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 19:47:56,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:47:58,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:47:58,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:02,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:48:02,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 19:48:02,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:48:02,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:02,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 19:48:13,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:48:19,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:19,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:19,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:48:21,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:48:27,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:30,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:48:30,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:48:31,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:48:31,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:48:33,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:48:36,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:38,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:44,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:48:44,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 19:48:44,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:48:44,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:46,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:48:47,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:48:47,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:49,410 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.344e+02 2.648e+02 3.019e+02 5.195e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 19:48:54,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:48:54,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.18 vs. limit=15.0 2023-09-28 19:48:55,601 INFO [train.py:1039] (1/4) Epoch 4, batch 2600, loss[loss=0.2287, simple_loss=0.2986, pruned_loss=0.07943, over 24342.00 frames. ], tot_loss[loss=0.2644, simple_loss=0.3181, pruned_loss=0.1053, over 4708957.47 frames. ], batch size: 77, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:48:55,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:57,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=123573.33333333333, ans=0.09899494936611666 2023-09-28 19:48:58,991 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 19:49:00,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=123573.33333333333, ans=0.125 2023-09-28 19:49:02,182 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 19:49:02,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:49:04,159 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 19:49:04,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 19:49:04,314 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 19:49:07,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:49:07,508 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 19:49:09,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 19:49:11,061 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 19:49:12,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:49:14,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 19:49:16,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 19:49:17,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:49:18,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 19:49:18,986 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.68 vs. limit=15.0 2023-09-28 19:49:20,975 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 19:49:21,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 19:49:27,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:27,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:27,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:27,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 19:49:27,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=123706.66666666667, ans=0.0 2023-09-28 19:49:28,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:49:34,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=123706.66666666667, ans=0.0 2023-09-28 19:49:37,229 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 19:49:41,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:41,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:43,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 19:49:44,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:49:44,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:44,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 19:49:49,594 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.20 vs. limit=15.0 2023-09-28 19:49:50,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:49:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:49:53,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,331 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 19:49:56,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:50:01,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:50:01,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:50:01,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 19:50:04,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:50:05,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:07,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:13,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=123840.0, ans=0.0 2023-09-28 19:50:14,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 19:50:16,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:18,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:50:20,012 INFO [train.py:1039] (1/4) Epoch 4, batch 2650, loss[loss=0.2824, simple_loss=0.3443, pruned_loss=0.1102, over 24334.00 frames. ], tot_loss[loss=0.265, simple_loss=0.3186, pruned_loss=0.1058, over 4708074.30 frames. ], batch size: 77, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:50:21,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 19:50:21,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:21,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:50:24,052 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 19:50:24,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:50:25,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=123906.66666666667, ans=0.1 2023-09-28 19:50:26,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:30,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:50:30,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:33,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:50:33,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 19:50:33,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:50:33,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:50:37,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 19:50:38,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=123973.33333333333, ans=0.07 2023-09-28 19:50:39,494 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 19:50:43,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:48,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 19:50:48,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:50:48,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 19:50:53,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:50:53,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:50:54,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=124040.0, ans=0.125 2023-09-28 19:51:01,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 19:51:01,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 19:51:02,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.36 vs. limit=15.0 2023-09-28 19:51:03,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:06,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=124040.0, ans=0.0 2023-09-28 19:51:07,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 19:51:07,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:09,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:09,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:11,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:11,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:14,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:15,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:17,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:51:17,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:51:19,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:51:20,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.29 vs. limit=22.5 2023-09-28 19:51:21,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:21,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:51:24,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:24,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:24,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:51:27,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:29,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:51:29,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:31,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 19:51:35,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:35,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:37,257 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.373e+02 3.005e+02 3.591e+02 5.745e+02, threshold=6.010e+02, percent-clipped=4.0 2023-09-28 19:51:39,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:40,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:42,359 INFO [train.py:1039] (1/4) Epoch 4, batch 2700, loss[loss=0.2809, simple_loss=0.3458, pruned_loss=0.1081, over 24014.00 frames. ], tot_loss[loss=0.2652, simple_loss=0.3192, pruned_loss=0.1056, over 4718980.38 frames. ], batch size: 80, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:51:42,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:42,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:44,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:51:44,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 19:51:48,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:51:50,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 19:51:51,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:51,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:53,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:55,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:51:55,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:55,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:51:56,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:51:57,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 19:51:57,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:51:58,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:58,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:52:00,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:00,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=124306.66666666667, ans=0.0 2023-09-28 19:52:05,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:52:05,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 19:52:06,643 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.90 vs. limit=15.0 2023-09-28 19:52:07,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:52:12,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:52:12,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:52:18,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:52:18,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:52:18,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:52:18,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:52:21,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:52:26,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:26,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:52:26,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:52:30,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=124440.0, ans=0.1 2023-09-28 19:52:31,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:31,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:52:33,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=124440.0, ans=0.1 2023-09-28 19:52:35,904 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=15.0 2023-09-28 19:52:40,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:52:40,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:52:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:52:45,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:52:48,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:48,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:52:50,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:52,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:53,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=124506.66666666667, ans=0.2 2023-09-28 19:52:55,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:55,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:52:58,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:53:00,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:00,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:02,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 19:53:02,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=124506.66666666667, ans=0.125 2023-09-28 19:53:03,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:05,058 INFO [train.py:1039] (1/4) Epoch 4, batch 2750, loss[loss=0.2426, simple_loss=0.3074, pruned_loss=0.08892, over 24462.00 frames. ], tot_loss[loss=0.2648, simple_loss=0.3192, pruned_loss=0.1052, over 4719460.95 frames. ], batch size: 66, lr: 2.39e-02, grad_scale: 16.0 2023-09-28 19:53:06,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:53:06,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 19:53:08,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 19:53:08,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:10,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:10,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:15,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:15,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:53:15,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:18,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:20,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:53:20,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:53:20,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:20,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 19:53:20,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:53:20,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:27,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 19:53:30,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:53:30,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:30,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:53:32,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:53:32,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:33,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:53:35,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:35,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:35,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=124640.0, ans=0.0 2023-09-28 19:53:38,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:53:38,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:53:40,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:53:40,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:43,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:53:46,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.89 vs. limit=15.0 2023-09-28 19:53:50,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:52,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:53:52,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:56,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:56,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:53:58,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:54:04,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:54:04,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:54:04,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 19:54:10,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:11,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 19:54:14,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=124840.0, ans=0.0 2023-09-28 19:54:17,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:54:19,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:54:19,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 19:54:20,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:54:22,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:54:24,071 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.875e+02 2.756e+02 3.214e+02 3.920e+02 6.552e+02, threshold=6.428e+02, percent-clipped=3.0 2023-09-28 19:54:24,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 19:54:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:54:28,567 INFO [train.py:1039] (1/4) Epoch 4, batch 2800, loss[loss=0.2507, simple_loss=0.3128, pruned_loss=0.09434, over 24344.00 frames. ], tot_loss[loss=0.2631, simple_loss=0.3176, pruned_loss=0.1043, over 4714936.57 frames. ], batch size: 61, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:54:28,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 19:54:30,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:30,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:54:30,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 19:54:30,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:31,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:33,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:33,519 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 19:54:33,520 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 19:54:37,858 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=16.40 vs. limit=15.0 2023-09-28 19:54:38,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:40,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:54:41,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:54:43,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:54:43,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=15.0 2023-09-28 19:54:47,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 19:54:48,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 19:54:50,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 19:54:50,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:50,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:54:50,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:54:55,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:54:55,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:55,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:54:57,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:07,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:55:08,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:55:11,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:13,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:55:13,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:19,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:19,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 19:55:21,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:21,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:23,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:55:24,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=125106.66666666667, ans=0.0 2023-09-28 19:55:26,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:26,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:32,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:36,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:55:36,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:36,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:55:36,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:55:37,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:55:37,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:55:37,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 19:55:39,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:40,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:40,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:42,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 19:55:42,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:42,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:55:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:55:44,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 19:55:52,387 INFO [train.py:1039] (1/4) Epoch 4, batch 2850, loss[loss=0.2441, simple_loss=0.3128, pruned_loss=0.0877, over 24453.00 frames. ], tot_loss[loss=0.2619, simple_loss=0.3166, pruned_loss=0.1036, over 4731853.31 frames. ], batch size: 66, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:55:52,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:52,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:55:54,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:55:57,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:00,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:00,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:00,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:56:04,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:04,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:56:05,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:56:07,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 19:56:15,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 19:56:15,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:17,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 19:56:17,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:21,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 19:56:21,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 19:56:22,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:35,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:37,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:37,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:37,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:56:37,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:56:38,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:56:40,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:56:40,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 19:56:43,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:56:43,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:56:45,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:45,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:48,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:48,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:52,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:55,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:56:55,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:56,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:58,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:57:03,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:57:04,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 19:57:05,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 19:57:06,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:57:06,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=125506.66666666667, ans=0.04949747468305833 2023-09-28 19:57:08,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:08,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 19:57:09,341 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.413e+02 2.730e+02 3.344e+02 4.987e+02, threshold=5.460e+02, percent-clipped=0.0 2023-09-28 19:57:09,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:57:10,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:10,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:10,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:57:10,952 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 19:57:12,321 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 19:57:12,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:13,700 INFO [train.py:1039] (1/4) Epoch 4, batch 2900, loss[loss=0.2779, simple_loss=0.3213, pruned_loss=0.1172, over 23476.00 frames. ], tot_loss[loss=0.2616, simple_loss=0.3169, pruned_loss=0.1032, over 4735376.38 frames. ], batch size: 285, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:57:13,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:18,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:18,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:20,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:57:20,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 19:57:25,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:25,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 19:57:26,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 19:57:30,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:57:30,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:57:30,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:32,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:57:36,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:37,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:40,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:57:40,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 19:57:42,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:57:43,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:45,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 19:57:47,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 19:57:50,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:50,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 19:57:50,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:57:53,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:57:53,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:56,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:57,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:01,186 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.08 vs. limit=10.0 2023-09-28 19:58:01,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:58:03,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:07,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 19:58:07,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 19:58:07,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:58:11,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:58:14,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.22 vs. limit=15.0 2023-09-28 19:58:15,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 19:58:15,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:58:15,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=125773.33333333333, ans=0.125 2023-09-28 19:58:19,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:27,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:58:27,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:58:30,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 19:58:32,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:34,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 19:58:34,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:34,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:58:35,885 INFO [train.py:1039] (1/4) Epoch 4, batch 2950, loss[loss=0.2639, simple_loss=0.3282, pruned_loss=0.09984, over 24471.00 frames. ], tot_loss[loss=0.2634, simple_loss=0.3187, pruned_loss=0.104, over 4741469.93 frames. ], batch size: 66, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:58:40,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=125906.66666666667, ans=0.0 2023-09-28 19:58:42,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:42,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=125906.66666666667, ans=0.0 2023-09-28 19:58:43,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 19:58:43,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:43,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:46,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:58:48,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:58:48,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 19:58:48,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 19:58:50,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:58:50,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:57,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:58:59,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:01,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:01,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:04,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:04,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:59:06,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:59:12,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 19:59:16,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 19:59:16,034 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 19:59:16,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:59:18,236 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 19:59:21,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 19:59:21,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:21,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:59:21,141 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 19:59:21,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:59:22,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 19:59:24,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:25,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:59:27,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:28,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:59:28,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:30,305 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 19:59:31,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:31,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 19:59:37,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:38,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:59:40,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 19:59:40,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:59:41,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 19:59:45,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:46,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:46,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:59:50,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:50,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:59:52,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:59:53,509 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.372e+02 2.758e+02 3.353e+02 4.666e+02, threshold=5.516e+02, percent-clipped=0.0 2023-09-28 19:59:53,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:53,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:59:53,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:59:53,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:55,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:59:56,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:56,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 19:59:58,294 INFO [train.py:1039] (1/4) Epoch 4, batch 3000, loss[loss=0.2695, simple_loss=0.3122, pruned_loss=0.1133, over 23834.00 frames. ], tot_loss[loss=0.2646, simple_loss=0.3194, pruned_loss=0.1049, over 4727147.28 frames. ], batch size: 195, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:59:58,295 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 20:00:13,222 INFO [train.py:1071] (1/4) Epoch 4, validation: loss=0.3352, simple_loss=0.3262, pruned_loss=0.1721, over 1125622.00 frames. 2023-09-28 20:00:13,223 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 20:00:13,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:00:13,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=20.59 vs. limit=15.0 2023-09-28 20:00:15,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:00:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:00:19,669 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 20:00:19,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 20:00:23,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:00:23,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:00:25,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 20:00:25,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:31,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:00:40,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:00:47,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 20:00:49,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:00:52,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:00:52,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:52,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:00:56,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:00:56,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 20:00:58,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=126373.33333333333, ans=0.2 2023-09-28 20:00:59,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 20:01:01,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:01:02,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:01:05,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:01:05,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:05,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:05,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:01:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:01:11,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:01:11,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:01:12,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=126440.0, ans=0.125 2023-09-28 20:01:13,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:15,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 20:01:16,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:01:16,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:16,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:01:17,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=7.43 vs. limit=15.0 2023-09-28 20:01:22,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:22,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:23,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:01:23,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 20:01:25,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:01:25,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 20:01:26,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:01:28,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 20:01:32,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:01:32,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:01:32,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 20:01:34,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 20:01:34,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:01:34,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=126573.33333333333, ans=0.0 2023-09-28 20:01:35,611 INFO [train.py:1039] (1/4) Epoch 4, batch 3050, loss[loss=0.2622, simple_loss=0.3127, pruned_loss=0.1058, over 23528.00 frames. ], tot_loss[loss=0.2644, simple_loss=0.3194, pruned_loss=0.1046, over 4725261.74 frames. ], batch size: 134, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 20:01:35,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:01:37,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:37,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:01:37,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:37,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:01:40,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 20:01:43,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:01:44,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:01:44,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:01:48,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:51,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 20:01:54,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.72 vs. limit=15.0 2023-09-28 20:01:56,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 20:01:56,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 20:01:56,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:00,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:02:01,264 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.92 vs. limit=15.0 2023-09-28 20:02:05,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:05,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:07,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:10,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.96 vs. limit=22.5 2023-09-28 20:02:10,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:10,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:02:10,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:11,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=126706.66666666667, ans=0.0 2023-09-28 20:02:12,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:12,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:12,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:12,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=126706.66666666667, ans=0.0 2023-09-28 20:02:15,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:19,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:19,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 20:02:19,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:19,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:02:20,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=126706.66666666667, ans=0.125 2023-09-28 20:02:20,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=126706.66666666667, ans=0.2 2023-09-28 20:02:22,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:02:24,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:02:24,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:02:25,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:32,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:32,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:41,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:41,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:02:41,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:41,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:43,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:02:43,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:44,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 20:02:45,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:45,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:47,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 20:02:48,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:52,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:53,680 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.370e+02 2.708e+02 3.419e+02 5.330e+02, threshold=5.417e+02, percent-clipped=0.0 2023-09-28 20:02:53,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:02:56,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:02:58,159 INFO [train.py:1039] (1/4) Epoch 4, batch 3100, loss[loss=0.2682, simple_loss=0.3389, pruned_loss=0.09876, over 24303.00 frames. ], tot_loss[loss=0.2646, simple_loss=0.3196, pruned_loss=0.1048, over 4722440.44 frames. ], batch size: 74, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:02:59,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 20:03:01,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 20:03:01,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 20:03:03,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=126906.66666666667, ans=0.125 2023-09-28 20:03:04,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:03:07,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:03:07,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:09,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:03:14,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:20,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 20:03:21,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=126973.33333333333, ans=0.0 2023-09-28 20:03:28,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:03:28,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:29,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:03:29,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:03:31,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:03:32,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:03:33,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 20:03:33,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:03:33,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=127040.0, ans=0.125 2023-09-28 20:03:34,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:36,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 20:03:36,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:03:41,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:03:41,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 20:03:43,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 20:03:45,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:45,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:48,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:03:48,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:48,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:03:52,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:03:52,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:03:55,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:03:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:03:55,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:55,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:03:59,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:04:01,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 20:04:02,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:04:02,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 20:04:04,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:04,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:04,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 20:04:11,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=127173.33333333333, ans=0.2 2023-09-28 20:04:17,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 20:04:21,072 INFO [train.py:1039] (1/4) Epoch 4, batch 3150, loss[loss=0.2684, simple_loss=0.3135, pruned_loss=0.1117, over 23766.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.317, pruned_loss=0.1041, over 4710230.66 frames. ], batch size: 179, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:04:21,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:21,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:22,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:04:22,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:04:24,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 20:04:24,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:24,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:04:26,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 20:04:29,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:32,610 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 20:04:35,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 20:04:35,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:04:37,557 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 20:04:37,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:04:40,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 20:04:40,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 20:04:40,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 20:04:40,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:41,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:04:43,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:44,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 20:04:45,530 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.98 vs. limit=15.0 2023-09-28 20:04:46,453 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:04:47,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:47,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:48,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:50,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:04:53,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 20:04:54,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:04:56,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:04:58,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:58,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 20:05:01,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 20:05:03,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:05:03,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:05:03,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:05:04,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:04,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:05:06,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:05:06,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:05:07,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 20:05:07,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:05:07,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:08,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff3.min_abs, batch_count=127373.33333333333, ans=0.2 2023-09-28 20:05:09,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:05:09,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:05:11,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 20:05:11,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:13,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 20:05:13,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:13,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 20:05:14,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 20:05:17,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:05:17,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:19,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 20:05:19,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=127440.0, ans=0.0 2023-09-28 20:05:21,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:05:21,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:21,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=127440.0, ans=0.0 2023-09-28 20:05:25,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:05:26,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:26,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:05:33,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:05:34,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:38,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 20:05:39,688 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.347e+02 2.789e+02 3.421e+02 6.245e+02, threshold=5.579e+02, percent-clipped=5.0 2023-09-28 20:05:43,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:05:43,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 20:05:44,422 INFO [train.py:1039] (1/4) Epoch 4, batch 3200, loss[loss=0.2826, simple_loss=0.3219, pruned_loss=0.1217, over 23525.00 frames. ], tot_loss[loss=0.262, simple_loss=0.3155, pruned_loss=0.1043, over 4689249.66 frames. ], batch size: 256, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:05:48,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:49,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:05:49,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 20:05:49,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=127573.33333333333, ans=0.125 2023-09-28 20:05:52,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:57,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:06:02,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:06:04,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=127640.0, ans=0.125 2023-09-28 20:06:12,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:06:23,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.18 vs. limit=15.0 2023-09-28 20:06:23,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 20:06:23,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:06:27,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 20:06:29,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:06:32,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:06:32,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:06:34,222 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-09-28 20:06:35,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:06:37,848 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.28 vs. limit=15.0 2023-09-28 20:06:38,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 20:06:40,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:06:43,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 20:06:45,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 20:06:48,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:06:54,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:54,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:06:54,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:54,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=127840.0, ans=0.0 2023-09-28 20:06:54,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.82 vs. limit=12.0 2023-09-28 20:06:55,543 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 20:06:55,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:06:59,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:06:59,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 20:07:01,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 20:07:01,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 20:07:02,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 20:07:05,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:07:07,649 INFO [train.py:1039] (1/4) Epoch 4, batch 3250, loss[loss=0.2512, simple_loss=0.3187, pruned_loss=0.09179, over 24488.00 frames. ], tot_loss[loss=0.2615, simple_loss=0.3156, pruned_loss=0.1037, over 4696857.46 frames. ], batch size: 66, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:07:09,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:07:09,338 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 20:07:09,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:09,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:12,425 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 20:07:16,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:07:16,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=127906.66666666667, ans=0.125 2023-09-28 20:07:17,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:28,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:07:28,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 20:07:30,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:30,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:07:32,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:32,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:32,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:07:34,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:34,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:07:34,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:36,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:07:37,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:39,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:41,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:41,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:43,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:44,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:44,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:07:51,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 20:07:51,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:52,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:07:52,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:55,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:07:59,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:08:06,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:06,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:06,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 20:08:06,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:08:07,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:08:07,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:11,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 20:08:11,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 20:08:11,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:08:13,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:13,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:14,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:08:15,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=128173.33333333333, ans=0.0 2023-09-28 20:08:16,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:19,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:19,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:21,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 20:08:21,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:23,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:08:23,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 20:08:25,981 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.244e+02 2.605e+02 3.006e+02 4.571e+02, threshold=5.210e+02, percent-clipped=0.0 2023-09-28 20:08:26,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:26,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 20:08:27,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 20:08:29,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 20:08:29,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:30,591 INFO [train.py:1039] (1/4) Epoch 4, batch 3300, loss[loss=0.2768, simple_loss=0.3247, pruned_loss=0.1145, over 23715.00 frames. ], tot_loss[loss=0.263, simple_loss=0.3171, pruned_loss=0.1045, over 4698103.68 frames. ], batch size: 256, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:08:35,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:35,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:08:35,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=128240.0, ans=0.125 2023-09-28 20:08:37,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:39,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:08:39,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:08:42,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:43,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:47,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=128306.66666666667, ans=0.1 2023-09-28 20:08:49,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 20:08:49,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:08:49,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:52,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:52,137 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 20:08:52,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=128306.66666666667, ans=0.2 2023-09-28 20:08:55,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:08:55,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:08:57,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:08:57,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:08:57,270 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 20:09:00,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:00,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:09:02,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:02,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 20:09:04,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:09:04,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:06,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:09:08,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 20:09:10,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 20:09:10,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=128373.33333333333, ans=0.0 2023-09-28 20:09:11,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:09:15,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 20:09:18,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:19,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:09:21,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:09:21,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=128440.0, ans=0.125 2023-09-28 20:09:24,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:25,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:25,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:25,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:09:28,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:09:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:30,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:09:31,880 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 20:09:34,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 20:09:35,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.95 vs. limit=10.0 2023-09-28 20:09:37,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:09:39,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:09:39,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:40,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:40,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:43,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:09:43,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:43,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:09:45,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:46,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:09:50,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 20:09:50,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:09:50,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:51,893 INFO [train.py:1039] (1/4) Epoch 4, batch 3350, loss[loss=0.2759, simple_loss=0.3341, pruned_loss=0.1088, over 23827.00 frames. ], tot_loss[loss=0.2634, simple_loss=0.3181, pruned_loss=0.1044, over 4711575.11 frames. ], batch size: 85, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:09:53,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:09:53,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:54,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=128573.33333333333, ans=0.0 2023-09-28 20:09:55,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:56,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:56,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:00,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:10:00,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:02,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=128573.33333333333, ans=0.0 2023-09-28 20:10:04,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:10:06,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:06,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=128573.33333333333, ans=0.09899494936611666 2023-09-28 20:10:09,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:10:09,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:10,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:10:10,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 20:10:13,810 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 20:10:13,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:15,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 20:10:15,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 20:10:16,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:10:18,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:10:18,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=128640.0, ans=0.0 2023-09-28 20:10:20,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:20,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 20:10:21,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:21,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:10:22,557 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.98 vs. limit=15.0 2023-09-28 20:10:23,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:26,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:26,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:28,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:10:32,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:33,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:33,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:37,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:10:37,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:39,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:39,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:40,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=128706.66666666667, ans=0.0 2023-09-28 20:10:43,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:44,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.70 vs. limit=15.0 2023-09-28 20:10:45,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 20:10:45,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:10:45,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 20:10:46,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:10:46,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 20:10:48,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:49,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:57,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:58,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 20:10:58,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:01,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:11:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:11:08,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:09,997 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.829e+02 2.437e+02 2.848e+02 3.538e+02 5.302e+02, threshold=5.697e+02, percent-clipped=3.0 2023-09-28 20:11:11,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 20:11:11,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:11:11,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:11:14,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:14,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 20:11:14,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:11:14,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 20:11:16,246 INFO [train.py:1039] (1/4) Epoch 4, batch 3400, loss[loss=0.3868, simple_loss=0.4004, pruned_loss=0.1866, over 19696.00 frames. ], tot_loss[loss=0.2672, simple_loss=0.321, pruned_loss=0.1067, over 4701563.04 frames. ], batch size: 389, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:11:16,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:18,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:11:18,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:11:18,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 20:11:19,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=128906.66666666667, ans=0.125 2023-09-28 20:11:24,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 20:11:24,307 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 20:11:24,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:11:30,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:30,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:31,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:33,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:11:37,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:11:39,525 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:11:40,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 20:11:45,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:11:47,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:47,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:47,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=129040.0, ans=0.125 2023-09-28 20:11:48,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:11:56,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:12:01,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 20:12:06,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 20:12:07,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:09,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:12:11,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:12:11,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:12:13,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:12:16,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:12:16,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:12:23,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:25,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 20:12:34,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:12:37,268 INFO [train.py:1039] (1/4) Epoch 4, batch 3450, loss[loss=0.2414, simple_loss=0.3114, pruned_loss=0.08572, over 24653.00 frames. ], tot_loss[loss=0.2657, simple_loss=0.3204, pruned_loss=0.1055, over 4711629.22 frames. ], batch size: 68, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:12:38,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 20:12:42,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 20:12:42,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:43,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:12:43,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 20:12:46,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:51,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:12:54,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=129306.66666666667, ans=0.125 2023-09-28 20:12:55,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:12:57,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:12:59,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:12:59,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:01,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:08,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 20:13:12,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 20:13:14,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:13:14,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:13:15,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:22,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 20:13:22,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:13:25,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:27,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:13:30,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:13:30,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:13:32,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 20:13:32,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:13:33,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:36,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=129440.0, ans=0.125 2023-09-28 20:13:37,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:13:40,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 20:13:43,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:13:48,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:13:48,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=129506.66666666667, ans=0.125 2023-09-28 20:13:50,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:51,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:13:54,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=129506.66666666667, ans=0.125 2023-09-28 20:13:55,853 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.310e+02 2.657e+02 3.151e+02 5.022e+02, threshold=5.313e+02, percent-clipped=0.0 2023-09-28 20:13:57,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:57,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:57,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:13:59,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:14:01,127 INFO [train.py:1039] (1/4) Epoch 4, batch 3500, loss[loss=0.2664, simple_loss=0.2852, pruned_loss=0.1238, over 19495.00 frames. ], tot_loss[loss=0.2635, simple_loss=0.3179, pruned_loss=0.1046, over 4696090.74 frames. ], batch size: 389, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:14:04,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:05,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:14:08,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 20:14:11,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:14:12,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:14:12,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=129573.33333333333, ans=0.0 2023-09-28 20:14:15,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:15,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 20:14:22,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:14:23,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:14:23,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:14:23,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:25,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:14:25,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:25,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:25,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 20:14:29,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:31,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:14:32,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:33,237 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.46 vs. limit=15.0 2023-09-28 20:14:33,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten.whitening_limit, batch_count=129706.66666666667, ans=22.5 2023-09-28 20:14:37,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:37,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 20:14:37,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:40,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:43,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:14:43,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:45,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:14:45,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:48,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 20:14:48,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 20:14:49,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 20:14:51,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:51,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:52,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:52,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:14:55,187 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.55 vs. limit=6.0 2023-09-28 20:14:56,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:14:57,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:15:04,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:06,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 20:15:06,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 20:15:06,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:07,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:09,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:11,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:14,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 20:15:14,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:15,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:15:18,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 20:15:19,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 20:15:21,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:22,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:22,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:22,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:24,304 INFO [train.py:1039] (1/4) Epoch 4, batch 3550, loss[loss=0.2624, simple_loss=0.3223, pruned_loss=0.1013, over 23686.00 frames. ], tot_loss[loss=0.2623, simple_loss=0.3157, pruned_loss=0.1044, over 4686208.28 frames. ], batch size: 85, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:15:27,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:15:34,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:37,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 20:15:41,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:43,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:15:46,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:46,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:15:46,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:15:51,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:51,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:15:52,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:52,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:15:53,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:15:59,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:15:59,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:16:01,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:01,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:02,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:16:02,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 20:16:02,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:04,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:04,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=130040.0, ans=0.125 2023-09-28 20:16:05,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:16:09,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:11,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:16:13,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:15,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.74 vs. limit=10.0 2023-09-28 20:16:16,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 20:16:18,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:16:18,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 20:16:19,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:21,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:16:21,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:16:24,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 20:16:24,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:31,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:32,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 20:16:33,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:36,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:38,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=130173.33333333333, ans=0.125 2023-09-28 20:16:39,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 20:16:42,187 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.286e+02 2.757e+02 3.216e+02 5.394e+02, threshold=5.514e+02, percent-clipped=1.0 2023-09-28 20:16:44,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 20:16:44,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:16:44,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=130173.33333333333, ans=0.125 2023-09-28 20:16:46,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:16:46,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:48,262 INFO [train.py:1039] (1/4) Epoch 4, batch 3600, loss[loss=0.2825, simple_loss=0.3254, pruned_loss=0.1198, over 23805.00 frames. ], tot_loss[loss=0.2618, simple_loss=0.316, pruned_loss=0.1038, over 4705022.56 frames. ], batch size: 179, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:16:48,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:48,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=130240.0, ans=0.1 2023-09-28 20:16:50,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:16:54,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:55,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=130240.0, ans=0.1 2023-09-28 20:16:56,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:57,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:16:58,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=130240.0, ans=0.125 2023-09-28 20:17:00,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:17:01,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:01,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 20:17:04,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:17:06,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:06,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=130306.66666666667, ans=0.125 2023-09-28 20:17:10,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:13,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:15,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:17:15,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:17:15,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 20:17:17,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:20,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:21,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:17:22,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:24,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:26,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:17:28,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 20:17:35,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:17:35,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:17:36,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 20:17:40,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=130440.0, ans=0.1 2023-09-28 20:17:41,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:17:43,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.30 vs. limit=15.0 2023-09-28 20:17:45,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:47,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:54,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:17:54,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:17:54,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 20:17:55,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.02 vs. limit=15.0 2023-09-28 20:17:57,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 20:18:00,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 20:18:02,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:18:02,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:18:03,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 20:18:05,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:05,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:18:05,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:07,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 20:18:07,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 20:18:10,889 INFO [train.py:1039] (1/4) Epoch 4, batch 3650, loss[loss=0.2782, simple_loss=0.325, pruned_loss=0.1157, over 23905.00 frames. ], tot_loss[loss=0.2619, simple_loss=0.3167, pruned_loss=0.1035, over 4711614.30 frames. ], batch size: 195, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:18:11,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:18:11,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 20:18:17,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 20:18:18,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:18:23,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 20:18:24,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 20:18:26,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=130640.0, ans=0.07 2023-09-28 20:18:30,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:18:30,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:18:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:18:33,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=130640.0, ans=0.1 2023-09-28 20:18:35,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:18:35,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:37,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 20:18:37,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:18:39,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:39,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 20:18:39,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:18:41,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:18:41,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:18:43,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:18:46,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 20:18:47,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 20:18:48,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=130706.66666666667, ans=0.125 2023-09-28 20:18:49,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:18:52,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 20:18:53,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:18:53,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:18:54,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.98 vs. limit=22.5 2023-09-28 20:18:58,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:19:00,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:00,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:19:02,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:19:03,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:19:05,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:19:06,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:09,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:09,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:19:10,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=130773.33333333333, ans=0.125 2023-09-28 20:19:11,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:19:13,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:13,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:21,136 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 20:19:22,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:24,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:25,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:19:25,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:27,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:19:28,551 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.316e+02 2.706e+02 3.127e+02 4.745e+02, threshold=5.412e+02, percent-clipped=0.0 2023-09-28 20:19:28,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 20:19:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:30,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=130840.0, ans=0.0 2023-09-28 20:19:33,354 INFO [train.py:1039] (1/4) Epoch 4, batch 3700, loss[loss=0.2765, simple_loss=0.3243, pruned_loss=0.1143, over 23627.00 frames. ], tot_loss[loss=0.262, simple_loss=0.317, pruned_loss=0.1035, over 4706852.91 frames. ], batch size: 149, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:19:34,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:19:36,754 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:19:38,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:39,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:19:42,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:42,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 20:19:42,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:42,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:19:44,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:19:47,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:19:51,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:51,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:53,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:19:53,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:53,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:19:55,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:56,675 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 20:19:56,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=130973.33333333333, ans=0.2 2023-09-28 20:20:05,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:20:07,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:20:07,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:20:07,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 20:20:08,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:12,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:13,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 20:20:15,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:16,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:20:20,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:20:23,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:20:26,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:26,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 20:20:28,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:20:28,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 20:20:32,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:20:32,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:20:37,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:37,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 20:20:39,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:20:39,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:20:40,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:40,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:40,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=131173.33333333334, ans=0.125 2023-09-28 20:20:43,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:45,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 20:20:45,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 20:20:47,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:20:47,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:20:48,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:20:50,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:20:53,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:55,679 INFO [train.py:1039] (1/4) Epoch 4, batch 3750, loss[loss=0.2546, simple_loss=0.3248, pruned_loss=0.0922, over 23969.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3185, pruned_loss=0.1036, over 4716123.07 frames. ], batch size: 80, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:20:55,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:20:57,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:20:59,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 20:21:00,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=131240.0, ans=0.125 2023-09-28 20:21:01,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:21:04,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:21:04,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 20:21:06,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:21:07,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:12,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:17,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:21:18,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:21:20,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:21:22,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:23,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 20:21:23,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:25,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:25,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:28,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=131373.33333333334, ans=0.0 2023-09-28 20:21:30,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 20:21:34,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 20:21:36,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:36,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:39,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:41,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=131373.33333333334, ans=0.1 2023-09-28 20:21:43,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:45,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:21:50,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 20:21:54,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:57,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:59,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:21:59,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=131506.66666666666, ans=0.0 2023-09-28 20:22:03,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:22:05,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=131506.66666666666, ans=0.2 2023-09-28 20:22:07,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:22:07,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:22:10,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:22:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:22:13,150 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.509e+02 2.927e+02 3.521e+02 5.743e+02, threshold=5.855e+02, percent-clipped=1.0 2023-09-28 20:22:13,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:22:17,705 INFO [train.py:1039] (1/4) Epoch 4, batch 3800, loss[loss=0.2733, simple_loss=0.3246, pruned_loss=0.111, over 23347.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.3187, pruned_loss=0.1033, over 4720885.54 frames. ], batch size: 93, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:22:18,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=131573.33333333334, ans=0.125 2023-09-28 20:22:23,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:22:26,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:27,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:22:27,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=131573.33333333334, ans=0.0 2023-09-28 20:22:28,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 20:22:30,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:31,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:33,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:22:35,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:22:35,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:38,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:22:39,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:39,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:22:39,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:42,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 20:22:45,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 20:22:45,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:22:48,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:51,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:22:52,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:22:54,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:22:54,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:54,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=131706.66666666666, ans=0.07 2023-09-28 20:22:56,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=131706.66666666666, ans=0.1 2023-09-28 20:22:57,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:57,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=131706.66666666666, ans=0.125 2023-09-28 20:22:58,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:23:02,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:23:02,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 20:23:05,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:12,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:14,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=131773.33333333334, ans=0.125 2023-09-28 20:23:17,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:23:19,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 20:23:22,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 20:23:24,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:24,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:25,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:27,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 20:23:30,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 20:23:30,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 20:23:31,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:33,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:33,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=131840.0, ans=0.125 2023-09-28 20:23:39,613 INFO [train.py:1039] (1/4) Epoch 4, batch 3850, loss[loss=0.2743, simple_loss=0.332, pruned_loss=0.1083, over 23931.00 frames. ], tot_loss[loss=0.2619, simple_loss=0.3173, pruned_loss=0.1032, over 4721584.33 frames. ], batch size: 86, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:23:39,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:23:39,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:23:45,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:23:45,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 20:23:48,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:23:48,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:52,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:23:55,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:58,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:24:00,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 20:24:05,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:06,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:24:07,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=131973.33333333334, ans=0.1 2023-09-28 20:24:08,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:09,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:24:12,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:14,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:24:14,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:14,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:24:16,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:18,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:19,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:19,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:24:20,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=132040.0, ans=0.0 2023-09-28 20:24:22,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 20:24:22,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 20:24:22,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:22,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:25,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:25,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:25,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 20:24:28,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 20:24:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:33,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 20:24:33,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=132106.66666666666, ans=0.1 2023-09-28 20:24:36,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:24:42,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:43,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:48,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:48,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 20:24:52,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 20:24:54,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:54,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=132173.33333333334, ans=0.0 2023-09-28 20:24:56,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:57,995 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.471e+02 2.833e+02 3.573e+02 5.682e+02, threshold=5.667e+02, percent-clipped=0.0 2023-09-28 20:24:59,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:24:59,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:25:01,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:25:01,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 20:25:01,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=132240.0, ans=0.125 2023-09-28 20:25:02,737 INFO [train.py:1039] (1/4) Epoch 4, batch 3900, loss[loss=0.2471, simple_loss=0.2668, pruned_loss=0.1137, over 19039.00 frames. ], tot_loss[loss=0.2607, simple_loss=0.3158, pruned_loss=0.1028, over 4719667.72 frames. ], batch size: 388, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:25:02,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:25:04,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 20:25:04,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:04,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:06,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:25:07,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:09,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:25:09,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:09,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:25:09,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:09,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 20:25:09,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:12,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=132240.0, ans=0.125 2023-09-28 20:25:12,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=132240.0, ans=0.0 2023-09-28 20:25:13,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:15,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:15,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:25:16,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:20,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:20,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:23,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:25:25,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 20:25:25,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:27,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 20:25:28,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.57 vs. limit=15.0 2023-09-28 20:25:28,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:30,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 20:25:31,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 20:25:31,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=132306.66666666666, ans=0.0 2023-09-28 20:25:37,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:37,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:37,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:25:39,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:25:42,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:44,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:25:46,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:25:46,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:25:48,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:25:54,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:54,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:26:02,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:26:05,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:26:15,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:18,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:18,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 20:26:18,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 20:26:18,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:19,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 20:26:21,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:26:21,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 20:26:24,474 INFO [train.py:1039] (1/4) Epoch 4, batch 3950, loss[loss=0.2649, simple_loss=0.3123, pruned_loss=0.1087, over 23650.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.3152, pruned_loss=0.1029, over 4714329.44 frames. ], batch size: 256, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:26:29,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:26:30,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 20:26:32,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:26:34,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:26:37,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:26:42,427 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 20:26:43,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:44,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 20:26:44,123 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 20:26:45,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:47,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:48,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:26:48,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:51,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 20:26:54,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:26:56,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:56,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:26:56,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:26:56,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:26:56,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=132706.66666666666, ans=0.0 2023-09-28 20:27:10,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:27:10,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:27:10,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=132706.66666666666, ans=0.2 2023-09-28 20:27:10,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=132706.66666666666, ans=0.0 2023-09-28 20:27:15,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 20:27:21,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 20:27:21,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 20:27:22,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:27:24,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:27:29,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=132840.0, ans=0.0 2023-09-28 20:27:31,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:27:31,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:27:32,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:27:32,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:27:34,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 20:27:37,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:27:39,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:27:42,855 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.456e+02 2.836e+02 3.414e+02 5.372e+02, threshold=5.673e+02, percent-clipped=0.0 2023-09-28 20:27:42,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 20:27:44,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=132840.0, ans=0.125 2023-09-28 20:27:48,134 INFO [train.py:1039] (1/4) Epoch 4, batch 4000, loss[loss=0.2576, simple_loss=0.3224, pruned_loss=0.09643, over 24578.00 frames. ], tot_loss[loss=0.2606, simple_loss=0.3158, pruned_loss=0.1027, over 4717007.47 frames. ], batch size: 68, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:27:55,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=132906.66666666666, ans=0.125 2023-09-28 20:27:56,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:04,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:10,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:28:10,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:10,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 20:28:11,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:28:11,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 20:28:12,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=132973.33333333334, ans=0.0 2023-09-28 20:28:13,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:28:13,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 20:28:14,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:19,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:28:19,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:28:19,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:28:19,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:19,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:28:21,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:28:24,115 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 20:28:24,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:28:24,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:28,903 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 20:28:29,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:28:29,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:37,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 20:28:38,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:40,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:28:41,599 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 20:28:43,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:28:43,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 20:28:43,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:28:44,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:44,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:28:46,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:28:46,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:28:47,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:50,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 20:28:51,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:53,145 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 20:28:57,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:29:02,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:29:05,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:29:05,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:05,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:29:06,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.06 vs. limit=22.5 2023-09-28 20:29:07,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:08,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=133240.0, ans=0.0 2023-09-28 20:29:10,079 INFO [train.py:1039] (1/4) Epoch 4, batch 4050, loss[loss=0.2493, simple_loss=0.2997, pruned_loss=0.09942, over 23676.00 frames. ], tot_loss[loss=0.2618, simple_loss=0.3169, pruned_loss=0.1033, over 4712542.22 frames. ], batch size: 232, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:29:10,584 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:29:13,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:16,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:29:16,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 20:29:17,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:29:19,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:29:19,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:29:21,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:21,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:25,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-09-28 20:29:25,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:30,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:29:30,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:29:30,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=133306.66666666666, ans=0.125 2023-09-28 20:29:31,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:29:31,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:29:38,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:42,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:44,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 20:29:47,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 20:29:47,159 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 20:29:47,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=133373.33333333334, ans=0.2 2023-09-28 20:29:48,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:29:54,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 20:29:55,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:29:55,826 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=17.14 vs. limit=15.0 2023-09-28 20:29:59,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:04,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:30:04,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:30:04,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:08,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:30:13,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 20:30:13,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:30:14,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:16,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 20:30:20,565 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:30:21,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:28,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 20:30:29,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:30:29,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:30:30,988 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.888e+02 2.307e+02 2.673e+02 3.242e+02 5.499e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:30:31,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 20:30:32,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 20:30:32,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:35,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:30:36,491 INFO [train.py:1039] (1/4) Epoch 4, batch 4100, loss[loss=0.2617, simple_loss=0.3176, pruned_loss=0.1029, over 23357.00 frames. ], tot_loss[loss=0.2627, simple_loss=0.3184, pruned_loss=0.1035, over 4719459.97 frames. ], batch size: 93, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:30:36,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:36,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:30:44,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 20:30:46,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 20:30:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 20:30:48,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 20:30:49,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:49,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:30:51,784 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 20:30:54,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:30:55,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=133640.0, ans=0.125 2023-09-28 20:30:56,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:30:56,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:56,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:30:58,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=133640.0, ans=0.125 2023-09-28 20:31:02,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:31:03,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:31:03,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:31:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 20:31:05,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:05,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:31:05,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:05,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:31:06,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 20:31:12,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:13,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 20:31:13,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=133706.66666666666, ans=0.0 2023-09-28 20:31:15,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:31:17,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:17,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 20:31:18,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:31:18,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:31:20,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:31:21,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 20:31:23,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:31:24,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:31:25,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 20:31:27,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:27,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:27,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=133773.33333333334, ans=0.125 2023-09-28 20:31:30,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:34,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:31:39,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:39,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:31:50,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:31:50,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:53,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:55,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:31:55,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=133840.0, ans=0.0 2023-09-28 20:31:58,096 INFO [train.py:1039] (1/4) Epoch 4, batch 4150, loss[loss=0.3301, simple_loss=0.3492, pruned_loss=0.1556, over 19714.00 frames. ], tot_loss[loss=0.2638, simple_loss=0.3192, pruned_loss=0.1042, over 4716129.87 frames. ], batch size: 389, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:31:59,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:59,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:32:01,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:32:01,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:05,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=133906.66666666666, ans=0.0 2023-09-28 20:32:06,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 20:32:07,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:07,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 20:32:09,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 20:32:09,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 20:32:11,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:15,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:32:15,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:20,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:22,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:32:23,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:32:23,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=133973.33333333334, ans=0.125 2023-09-28 20:32:26,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:32:26,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:28,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:32:31,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:34,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:35,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 20:32:39,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 20:32:39,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:32:39,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 20:32:39,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:32:39,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:32:42,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:32:43,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.31 vs. limit=15.0 2023-09-28 20:32:44,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:48,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 20:32:51,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:32:53,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:32:56,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 20:32:56,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:58,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 20:32:59,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:33:01,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:33:02,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:02,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 20:33:02,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:02,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:33:04,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:33:06,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 20:33:06,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:06,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:33:06,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:33:07,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 20:33:09,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:33:09,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:33:10,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:33:12,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:13,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 20:33:13,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:33:16,747 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.811e+02 2.342e+02 2.661e+02 3.100e+02 4.687e+02, threshold=5.322e+02, percent-clipped=0.0 2023-09-28 20:33:19,939 INFO [train.py:1039] (1/4) Epoch 4, batch 4200, loss[loss=0.2332, simple_loss=0.2914, pruned_loss=0.08749, over 24404.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.3173, pruned_loss=0.104, over 4694510.10 frames. ], batch size: 58, lr: 2.32e-02, grad_scale: 16.0 2023-09-28 20:33:20,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:33:21,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 20:33:23,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:33:23,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=134240.0, ans=0.2 2023-09-28 20:33:25,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:26,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:33:27,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:27,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:30,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 20:33:35,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 20:33:35,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:38,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:40,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:33:43,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:33:45,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:33:45,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:47,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 20:33:47,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:48,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:48,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:48,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:33:50,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:33:50,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=134306.66666666666, ans=10.0 2023-09-28 20:33:53,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 20:33:53,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:57,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:33:59,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:34:03,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:34:04,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:05,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:34:05,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 20:34:05,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:07,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:34:13,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:34:14,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.41 vs. limit=15.0 2023-09-28 20:34:14,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:21,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:34:21,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=134440.0, ans=0.125 2023-09-28 20:34:23,996 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.07 vs. limit=15.0 2023-09-28 20:34:24,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 20:34:27,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:29,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=134506.66666666666, ans=0.0 2023-09-28 20:34:31,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:34:32,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:35,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 20:34:42,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:34:43,558 INFO [train.py:1039] (1/4) Epoch 4, batch 4250, loss[loss=0.2289, simple_loss=0.2881, pruned_loss=0.08492, over 24312.00 frames. ], tot_loss[loss=0.261, simple_loss=0.3161, pruned_loss=0.1029, over 4696312.63 frames. ], batch size: 56, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:34:45,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:45,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:34:46,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.49 vs. limit=15.0 2023-09-28 20:34:48,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:53,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:34:53,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 20:34:53,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:56,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:56,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=134573.33333333334, ans=0.07 2023-09-28 20:34:59,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:03,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=134640.0, ans=0.2 2023-09-28 20:35:04,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:06,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:08,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:35:08,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:10,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:11,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:13,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:15,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:35:18,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:19,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 20:35:19,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=134706.66666666666, ans=0.125 2023-09-28 20:35:21,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=134706.66666666666, ans=10.0 2023-09-28 20:35:22,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 20:35:22,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:24,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:24,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:26,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:35:26,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:26,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:26,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=134706.66666666666, ans=0.2 2023-09-28 20:35:31,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:35:32,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:35:35,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:35:36,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=134773.33333333334, ans=0.0 2023-09-28 20:35:36,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.71 vs. limit=15.0 2023-09-28 20:35:37,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:37,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=134773.33333333334, ans=0.1 2023-09-28 20:35:38,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 20:35:38,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:35:40,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 20:35:42,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:35:45,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:35:46,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:46,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:48,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 20:35:49,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:35:51,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:35:53,821 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.32 vs. limit=15.0 2023-09-28 20:35:54,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:56,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:58,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:36:00,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:02,334 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.342e+02 2.586e+02 3.220e+02 5.035e+02, threshold=5.173e+02, percent-clipped=0.0 2023-09-28 20:36:02,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:02,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:36:04,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:04,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 20:36:05,386 INFO [train.py:1039] (1/4) Epoch 4, batch 4300, loss[loss=0.2917, simple_loss=0.3371, pruned_loss=0.1231, over 23897.00 frames. ], tot_loss[loss=0.2602, simple_loss=0.3148, pruned_loss=0.1027, over 4691340.42 frames. ], batch size: 195, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:36:05,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:08,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=134906.66666666666, ans=0.125 2023-09-28 20:36:11,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:11,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:14,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:17,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=134906.66666666666, ans=0.1 2023-09-28 20:36:21,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=134973.33333333334, ans=0.0 2023-09-28 20:36:23,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:36:23,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 20:36:23,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=134973.33333333334, ans=0.04949747468305833 2023-09-28 20:36:26,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:36:27,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:36:27,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:36:27,905 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 20:36:31,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:36:32,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:36:37,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 20:36:37,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:36:39,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 20:36:41,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:36:42,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:36:44,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:36:44,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:45,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:36:46,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=135040.0, ans=0.0 2023-09-28 20:36:47,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:49,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:49,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 20:36:49,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 20:36:49,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=135040.0, ans=0.1 2023-09-28 20:36:53,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:56,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:56,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:36:56,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:57,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:57,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 20:36:57,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 20:36:58,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 20:36:59,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:36:59,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 20:36:59,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 20:37:05,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:07,171 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 20:37:07,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:37:08,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:08,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:10,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 20:37:10,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:37:10,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:12,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:12,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:12,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=135173.33333333334, ans=0.125 2023-09-28 20:37:14,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:37:16,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:37:18,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:20,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:21,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:21,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=135173.33333333334, ans=0.125 2023-09-28 20:37:28,131 INFO [train.py:1039] (1/4) Epoch 4, batch 4350, loss[loss=0.2716, simple_loss=0.3343, pruned_loss=0.1045, over 24364.00 frames. ], tot_loss[loss=0.2623, simple_loss=0.3169, pruned_loss=0.1039, over 4691075.57 frames. ], batch size: 77, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:37:28,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 20:37:28,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:37:34,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:37:37,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:39,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:37:39,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:37:45,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:37:47,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:50,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:37:50,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:53,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:37:55,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:37:58,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:38:04,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 20:38:06,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:06,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:12,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:13,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 20:38:16,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:18,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:38:23,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=135440.0, ans=0.1 2023-09-28 20:38:24,901 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 20:38:26,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:26,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:38:26,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=135440.0, ans=0.05 2023-09-28 20:38:27,926 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 20:38:28,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=135440.0, ans=0.125 2023-09-28 20:38:29,404 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 20:38:29,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:29,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:30,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:38:30,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:31,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:31,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:38:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 20:38:35,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:35,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:35,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:37,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 20:38:38,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=135506.66666666666, ans=0.125 2023-09-28 20:38:39,154 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 20:38:39,174 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 20:38:39,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 20:38:42,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:38:42,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:38:42,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=135506.66666666666, ans=0.125 2023-09-28 20:38:43,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:38:43,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:38:46,863 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.300e+02 2.585e+02 3.110e+02 4.848e+02, threshold=5.170e+02, percent-clipped=0.0 2023-09-28 20:38:47,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 20:38:48,664 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 20:38:48,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:50,004 INFO [train.py:1039] (1/4) Epoch 4, batch 4400, loss[loss=0.2667, simple_loss=0.3383, pruned_loss=0.09748, over 23998.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.3173, pruned_loss=0.1039, over 4710666.01 frames. ], batch size: 80, lr: 2.31e-02, grad_scale: 32.0 2023-09-28 20:38:53,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:38:53,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:56,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:58,718 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:38:59,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 20:38:59,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 20:38:59,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 20:38:59,960 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 20:39:01,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:39:01,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:39:03,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 20:39:04,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:05,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=135640.0, ans=0.0 2023-09-28 20:39:06,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:07,522 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 20:39:11,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:11,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 20:39:12,049 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 20:39:15,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 20:39:15,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 20:39:15,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 20:39:15,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:17,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:17,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:19,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:20,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 20:39:20,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 20:39:22,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:25,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:39:25,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:28,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:28,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 20:39:29,650 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 20:39:33,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:38,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=135773.33333333334, ans=15.0 2023-09-28 20:39:39,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:41,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 20:39:43,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=135773.33333333334, ans=0.025 2023-09-28 20:39:45,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:39:45,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=135773.33333333334, ans=0.125 2023-09-28 20:39:49,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=135773.33333333334, ans=0.2 2023-09-28 20:39:49,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=135773.33333333334, ans=0.05 2023-09-28 20:39:50,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:39:51,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:39:51,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 20:39:51,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:39:51,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:39:51,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:39:53,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:39:58,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 20:40:01,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 20:40:02,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 20:40:02,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:02,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 20:40:04,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:40:05,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:40:08,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 20:40:11,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:40:13,250 INFO [train.py:1039] (1/4) Epoch 4, batch 4450, loss[loss=0.2916, simple_loss=0.3441, pruned_loss=0.1196, over 23339.00 frames. ], tot_loss[loss=0.2644, simple_loss=0.3187, pruned_loss=0.1051, over 4707846.34 frames. ], batch size: 93, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:40:13,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=135906.66666666666, ans=0.125 2023-09-28 20:40:13,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=135906.66666666666, ans=0.0 2023-09-28 20:40:16,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:16,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:40:20,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=135906.66666666666, ans=0.2 2023-09-28 20:40:23,672 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:40:26,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:40:26,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:40:26,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=135906.66666666666, ans=0.125 2023-09-28 20:40:31,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:31,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:40:33,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:40:34,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:36,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 20:40:36,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:38,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:38,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:40:38,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:40:39,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:40:44,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:48,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:49,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:40:55,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:40:57,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 20:40:57,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 20:40:57,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:41:00,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=12.0 2023-09-28 20:41:01,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:02,544 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-09-28 20:41:03,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 20:41:05,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=136106.66666666666, ans=0.0 2023-09-28 20:41:06,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:41:09,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:09,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 20:41:09,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:09,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:09,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:41:09,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:12,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:41:17,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 20:41:19,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:41:20,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:41:22,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:24,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:25,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:41:26,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:41:31,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 20:41:32,907 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.347e+02 2.673e+02 3.318e+02 4.703e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:41:33,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:41:33,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=136173.33333333334, ans=22.5 2023-09-28 20:41:35,979 INFO [train.py:1039] (1/4) Epoch 4, batch 4500, loss[loss=0.2561, simple_loss=0.2993, pruned_loss=0.1065, over 23853.00 frames. ], tot_loss[loss=0.2633, simple_loss=0.3183, pruned_loss=0.1042, over 4715522.12 frames. ], batch size: 164, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:41:36,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=136240.0, ans=0.125 2023-09-28 20:41:39,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:39,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=136240.0, ans=0.125 2023-09-28 20:41:40,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 20:41:40,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 20:41:41,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=136240.0, ans=0.1 2023-09-28 20:41:42,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:41:45,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:47,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:47,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:41:48,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:41:48,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:48,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:42:00,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=136306.66666666666, ans=0.125 2023-09-28 20:42:02,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:42:04,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:42:07,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:08,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:42:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:42:15,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:42:20,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:42:24,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:42:27,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:42:27,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 20:42:29,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:29,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:34,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:42:34,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 20:42:34,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:42:34,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:39,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:42:40,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:42:43,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:44,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:42:46,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:42:47,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=12.0 2023-09-28 20:42:47,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 20:42:50,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 20:42:50,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 20:42:55,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 20:42:56,794 INFO [train.py:1039] (1/4) Epoch 4, batch 4550, loss[loss=0.2765, simple_loss=0.3365, pruned_loss=0.1082, over 23748.00 frames. ], tot_loss[loss=0.2615, simple_loss=0.3171, pruned_loss=0.103, over 4712996.56 frames. ], batch size: 85, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:42:57,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 20:42:59,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:03,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:05,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:09,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:09,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=136573.33333333334, ans=0.125 2023-09-28 20:43:12,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:43:14,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:43:16,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:16,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:43:16,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:16,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.18 vs. limit=12.0 2023-09-28 20:43:20,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:20,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:23,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:43:26,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 20:43:28,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 20:43:29,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:43:29,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 20:43:32,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 20:43:34,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:37,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 20:43:39,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:43:42,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=136706.66666666666, ans=0.04949747468305833 2023-09-28 20:43:43,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=136706.66666666666, ans=0.2 2023-09-28 20:43:44,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:43:46,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 20:43:49,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:43:51,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:51,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:52,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:53,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=136773.33333333334, ans=0.1 2023-09-28 20:43:54,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 20:43:55,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 20:43:56,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:43:57,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 20:44:00,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 20:44:00,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:44:00,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:02,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:02,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:02,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:44:05,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:44:05,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 20:44:05,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=136840.0, ans=0.1 2023-09-28 20:44:06,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:44:06,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:44:08,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 20:44:08,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:44:08,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 20:44:11,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:44:11,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:44:12,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=136840.0, ans=0.2 2023-09-28 20:44:15,288 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.285e+02 2.509e+02 2.934e+02 4.311e+02, threshold=5.019e+02, percent-clipped=0.0 2023-09-28 20:44:15,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:44:15,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:17,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:44:19,147 INFO [train.py:1039] (1/4) Epoch 4, batch 4600, loss[loss=0.2595, simple_loss=0.3298, pruned_loss=0.09456, over 24652.00 frames. ], tot_loss[loss=0.261, simple_loss=0.3163, pruned_loss=0.1028, over 4720354.18 frames. ], batch size: 68, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:44:19,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:44:20,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:44:22,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:23,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:24,783 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.56 vs. limit=15.0 2023-09-28 20:44:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:44:26,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:44:27,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:28,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 20:44:30,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:44:34,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:44:36,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:37,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:42,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=136973.33333333334, ans=0.1 2023-09-28 20:44:42,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=136973.33333333334, ans=0.125 2023-09-28 20:44:45,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 20:44:47,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:50,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:54,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:44:54,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:55,038 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.06 vs. limit=22.5 2023-09-28 20:44:58,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 20:44:58,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:44:59,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:01,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=13.82 vs. limit=15.0 2023-09-28 20:45:03,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:03,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:45:05,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:45:07,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=137106.66666666666, ans=0.125 2023-09-28 20:45:09,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.42 vs. limit=10.0 2023-09-28 20:45:09,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 20:45:11,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:45:14,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:16,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:45:20,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:20,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 20:45:21,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:22,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 20:45:22,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:23,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:25,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:25,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:45:27,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:28,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 20:45:28,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 20:45:28,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 20:45:28,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:31,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:45:31,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:33,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:39,411 INFO [train.py:1039] (1/4) Epoch 4, batch 4650, loss[loss=0.248, simple_loss=0.3109, pruned_loss=0.09256, over 24491.00 frames. ], tot_loss[loss=0.2599, simple_loss=0.3154, pruned_loss=0.1022, over 4710478.75 frames. ], batch size: 66, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:45:42,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:45:45,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:45,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:45,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:45:45,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:47,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:45:49,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:53,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 20:45:57,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:45:59,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 20:45:59,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:59,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 20:46:00,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:46:02,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 20:46:02,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 20:46:02,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:02,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:46:07,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=137306.66666666666, ans=0.035 2023-09-28 20:46:08,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:46:08,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:09,915 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 20:46:10,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=137306.66666666666, ans=0.125 2023-09-28 20:46:13,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:13,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 20:46:16,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:16,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:46:17,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 20:46:20,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:46:23,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:46:28,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:32,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:35,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:35,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:37,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:46:38,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 20:46:40,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 20:46:40,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 20:46:40,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 20:46:42,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:46:49,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:46:49,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:46:50,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 20:46:50,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:51,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:51,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:46:53,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:46:56,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:46:56,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:56,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:57,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=137506.66666666666, ans=0.125 2023-09-28 20:46:58,904 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.187e+02 2.531e+02 2.951e+02 4.992e+02, threshold=5.061e+02, percent-clipped=0.0 2023-09-28 20:46:59,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=137506.66666666666, ans=0.1 2023-09-28 20:47:00,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:02,488 INFO [train.py:1039] (1/4) Epoch 4, batch 4700, loss[loss=0.2456, simple_loss=0.3015, pruned_loss=0.09486, over 24627.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.3166, pruned_loss=0.1022, over 4723701.60 frames. ], batch size: 60, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:47:02,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:47:02,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:47:02,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 20:47:05,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:47:06,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 20:47:10,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=137573.33333333334, ans=0.0 2023-09-28 20:47:14,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:14,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:16,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:47:17,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:19,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:47:25,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 20:47:25,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 20:47:27,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:27,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=137640.0, ans=0.1 2023-09-28 20:47:28,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:47:28,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:47:32,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:40,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:47:42,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:47:45,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:47,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=137706.66666666666, ans=0.125 2023-09-28 20:47:51,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 20:47:52,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:47:54,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:47:57,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 20:47:58,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:03,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:48:05,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 20:48:07,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:07,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:08,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.11 vs. limit=22.5 2023-09-28 20:48:10,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:48:12,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:48:12,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 20:48:12,313 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 20:48:15,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:17,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 20:48:19,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:22,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 20:48:25,427 INFO [train.py:1039] (1/4) Epoch 4, batch 4750, loss[loss=0.2696, simple_loss=0.3157, pruned_loss=0.1117, over 23791.00 frames. ], tot_loss[loss=0.2611, simple_loss=0.317, pruned_loss=0.1026, over 4710766.68 frames. ], batch size: 212, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:48:25,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:48:27,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:48:33,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 20:48:33,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:48:36,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 20:48:38,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:48:39,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:39,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:46,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 20:48:51,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:48:54,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 20:48:54,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:59,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:59,444 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 20:48:59,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 20:49:05,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 20:49:08,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:10,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:49:11,806 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 20:49:11,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:15,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:49:18,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:49:18,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 20:49:20,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 20:49:20,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:49:21,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:49:22,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:22,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:49:24,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 20:49:27,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 20:49:29,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:49:32,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:49:32,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 20:49:33,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:49:34,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:37,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:49:37,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:38,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:49:41,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=138173.33333333334, ans=0.0 2023-09-28 20:49:43,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:43,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 20:49:44,415 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.351e+02 2.944e+02 3.482e+02 5.215e+02, threshold=5.888e+02, percent-clipped=1.0 2023-09-28 20:49:44,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 20:49:46,139 INFO [train.py:1039] (1/4) Epoch 4, batch 4800, loss[loss=0.2288, simple_loss=0.2902, pruned_loss=0.08369, over 19007.00 frames. ], tot_loss[loss=0.2635, simple_loss=0.3188, pruned_loss=0.1041, over 4711606.00 frames. ], batch size: 41, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:49:46,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 20:49:47,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:49:49,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:51,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 20:49:56,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:58,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:05,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:50:05,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:05,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:06,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 20:50:08,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:50:08,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:50:09,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:50:13,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:14,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:16,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:50:17,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=138373.33333333334, ans=0.2 2023-09-28 20:50:19,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:19,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:50:19,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:22,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:25,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:50:27,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:50:29,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:29,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=138373.33333333334, ans=0.1 2023-09-28 20:50:33,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 20:50:33,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 20:50:34,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:34,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:50:35,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:50:35,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:35,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:50:36,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:50:37,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:41,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:41,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=138440.0, ans=0.125 2023-09-28 20:50:44,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:47,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:50:50,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 20:50:50,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:52,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:52,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:50:53,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:57,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=138506.66666666666, ans=0.125 2023-09-28 20:50:58,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:58,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:50:58,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:58,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:51:00,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:51:00,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:51:04,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:04,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:04,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:51:06,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 20:51:06,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=138506.66666666666, ans=0.125 2023-09-28 20:51:09,197 INFO [train.py:1039] (1/4) Epoch 4, batch 4850, loss[loss=0.2294, simple_loss=0.3026, pruned_loss=0.07816, over 24479.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.3183, pruned_loss=0.1035, over 4717996.30 frames. ], batch size: 69, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:51:09,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 20:51:09,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:09,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:10,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:10,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:14,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:51:21,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 20:51:23,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:23,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=138640.0, ans=0.0 2023-09-28 20:51:26,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:28,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:51:28,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:32,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:34,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:51:36,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:51:36,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 20:51:41,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:43,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:51:43,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:51:44,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:51:45,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 20:51:48,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:48,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 20:51:54,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 20:51:55,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:52:01,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=138773.33333333334, ans=0.125 2023-09-28 20:52:02,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:52:02,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 20:52:02,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:52:05,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:52:06,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:52:08,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 20:52:08,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:09,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 20:52:09,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:09,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:11,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 20:52:15,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=138840.0, ans=0.025 2023-09-28 20:52:20,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:24,766 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=18.84 vs. limit=15.0 2023-09-28 20:52:27,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:52:27,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:28,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=138840.0, ans=0.0 2023-09-28 20:52:29,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.282e+02 2.559e+02 2.982e+02 4.179e+02, threshold=5.119e+02, percent-clipped=0.0 2023-09-28 20:52:31,426 INFO [train.py:1039] (1/4) Epoch 4, batch 4900, loss[loss=0.2612, simple_loss=0.283, pruned_loss=0.1198, over 19084.00 frames. ], tot_loss[loss=0.2617, simple_loss=0.3171, pruned_loss=0.1032, over 4700675.81 frames. ], batch size: 388, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:52:32,135 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=12.0 2023-09-28 20:52:33,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 20:52:33,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:52:41,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:42,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:42,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:52:45,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 20:52:49,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 20:52:54,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 20:52:56,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 20:52:56,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:52:56,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:56,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:52:58,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:58,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:52:58,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 20:52:59,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=138973.33333333334, ans=0.2 2023-09-28 20:53:01,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 20:53:02,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:53:03,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=139040.0, ans=0.1 2023-09-28 20:53:04,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:53:06,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:53:07,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:53:08,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:09,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:09,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 20:53:11,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:53:12,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:53:13,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 20:53:13,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 20:53:17,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 20:53:19,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:53:23,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:53:23,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:53:23,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:23,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:53:25,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:53:25,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 20:53:25,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=139106.66666666666, ans=0.125 2023-09-28 20:53:28,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:29,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:53:31,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:53:35,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 20:53:35,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:53:35,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:53:36,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 20:53:38,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=139173.33333333334, ans=0.1 2023-09-28 20:53:42,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:44,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:53:46,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 20:53:46,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:53:47,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:53:49,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:49,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=139173.33333333334, ans=0.125 2023-09-28 20:53:52,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:53:52,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:53:52,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:54,531 INFO [train.py:1039] (1/4) Epoch 4, batch 4950, loss[loss=0.2436, simple_loss=0.3171, pruned_loss=0.08504, over 24562.00 frames. ], tot_loss[loss=0.2602, simple_loss=0.3153, pruned_loss=0.1026, over 4703259.48 frames. ], batch size: 71, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:53:54,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:53:56,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:53:59,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:53:59,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:54:01,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=139240.0, ans=0.025 2023-09-28 20:54:02,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 20:54:02,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 20:54:02,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:54:02,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=139240.0, ans=0.1 2023-09-28 20:54:06,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 20:54:06,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:06,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:54:07,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:54:07,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:09,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:11,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:54:12,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:54:14,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:54:15,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:15,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:54:18,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:54:25,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:26,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:54:28,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:28,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:30,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=139373.33333333334, ans=0.0 2023-09-28 20:54:32,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:54:33,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 20:54:35,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 20:54:35,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:54:39,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:54:42,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:54:42,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:54:43,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:54:45,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:46,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:54:48,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:54:50,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:51,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:51,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 20:54:53,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:54:53,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:54:57,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:54:58,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:54:58,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:55:00,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:00,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:55:00,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:55:00,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=139506.66666666666, ans=0.0 2023-09-28 20:55:03,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:55:04,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:55:05,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:55:05,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 20:55:10,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:15,178 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.343e+02 2.664e+02 3.186e+02 5.232e+02, threshold=5.328e+02, percent-clipped=1.0 2023-09-28 20:55:16,768 INFO [train.py:1039] (1/4) Epoch 4, batch 5000, loss[loss=0.2603, simple_loss=0.3264, pruned_loss=0.09711, over 24625.00 frames. ], tot_loss[loss=0.2597, simple_loss=0.3153, pruned_loss=0.1021, over 4708467.63 frames. ], batch size: 68, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:55:16,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 20:55:16,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:55:21,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:21,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:23,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 20:55:24,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 20:55:26,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:55:28,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 20:55:28,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:55:28,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:55:30,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 20:55:30,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:31,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:55:31,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=139640.0, ans=0.125 2023-09-28 20:55:33,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 20:55:33,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:33,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:55:34,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=139640.0, ans=0.125 2023-09-28 20:55:35,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 20:55:36,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 20:55:36,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:55:36,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 20:55:36,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:55:38,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:38,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:55:38,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 20:55:38,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 20:55:40,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 20:55:40,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:42,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:42,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 20:55:43,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:45,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:45,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:48,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:55:49,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 20:55:51,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:55:54,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:55:57,643 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 20:55:58,161 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.72 vs. limit=12.0 2023-09-28 20:55:59,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:56:02,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:56:02,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:04,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 20:56:06,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:56:06,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:06,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:07,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:56:09,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:12,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.02 vs. limit=15.0 2023-09-28 20:56:13,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:13,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:20,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 20:56:24,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:32,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:32,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=139840.0, ans=0.125 2023-09-28 20:56:34,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:34,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:56:34,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:36,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:56:36,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:56:37,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:38,961 INFO [train.py:1039] (1/4) Epoch 4, batch 5050, loss[loss=0.2643, simple_loss=0.3239, pruned_loss=0.1024, over 24475.00 frames. ], tot_loss[loss=0.2615, simple_loss=0.3169, pruned_loss=0.103, over 4714604.44 frames. ], batch size: 66, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:56:42,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:42,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 20:56:44,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:56:48,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:49,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:56:51,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 20:56:53,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:53,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:54,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:56:56,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:56:57,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:57:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 20:57:05,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:57:07,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:07,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 20:57:07,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:10,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:10,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:57:10,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:57:10,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 20:57:12,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 20:57:14,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:19,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:22,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:22,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 20:57:22,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=140040.0, ans=0.125 2023-09-28 20:57:24,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:26,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=140040.0, ans=0.0 2023-09-28 20:57:28,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 20:57:29,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:57:30,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:57:30,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:30,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:31,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:57:33,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=140106.66666666666, ans=0.125 2023-09-28 20:57:34,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:57:34,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:34,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:57:34,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:57:36,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 20:57:36,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:57:39,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:43,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:43,847 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 20:57:43,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:57:45,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:57:45,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:45,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 20:57:49,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:49,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 20:57:49,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:54,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 20:57:55,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=140173.33333333334, ans=0.2 2023-09-28 20:57:56,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 20:58:00,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:00,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:00,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:58:02,028 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.333e+02 2.668e+02 3.236e+02 5.838e+02, threshold=5.336e+02, percent-clipped=1.0 2023-09-28 20:58:03,582 INFO [train.py:1039] (1/4) Epoch 4, batch 5100, loss[loss=0.2545, simple_loss=0.3063, pruned_loss=0.1014, over 23662.00 frames. ], tot_loss[loss=0.2623, simple_loss=0.3181, pruned_loss=0.1032, over 4713982.95 frames. ], batch size: 149, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:58:05,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 20:58:08,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:58:11,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 20:58:11,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 20:58:11,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=140240.0, ans=0.0 2023-09-28 20:58:12,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:14,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:58:16,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=140240.0, ans=0.1 2023-09-28 20:58:18,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:58:18,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 20:58:20,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 20:58:22,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=140306.66666666666, ans=0.1 2023-09-28 20:58:25,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:58:25,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:58:28,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:34,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 20:58:34,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:35,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=140373.33333333334, ans=0.2 2023-09-28 20:58:36,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:58:36,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:58:37,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:39,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:39,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 20:58:41,502 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 20:58:41,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:42,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 20:58:42,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 20:58:46,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:55,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:58:58,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 20:58:58,195 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 20:58:58,221 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 20:59:01,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 20:59:01,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:59:06,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 20:59:10,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 20:59:12,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:59:14,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:59:16,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 20:59:17,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:59:19,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 20:59:24,964 INFO [train.py:1039] (1/4) Epoch 4, batch 5150, loss[loss=0.2793, simple_loss=0.3194, pruned_loss=0.1196, over 23771.00 frames. ], tot_loss[loss=0.261, simple_loss=0.3178, pruned_loss=0.1021, over 4729929.24 frames. ], batch size: 179, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:59:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:59:25,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:59:25,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:59:26,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:59:26,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:59:26,968 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:59:27,789 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.06 vs. limit=15.0 2023-09-28 20:59:28,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:59:28,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 20:59:28,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 20:59:29,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 20:59:29,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:59:29,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 20:59:31,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:32,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:59:34,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:36,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:41,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:59:41,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 20:59:43,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:44,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:59:45,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:59:45,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:59:46,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:59:47,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:59:47,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:59:47,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 20:59:49,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:59:50,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:59:50,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=140640.0, ans=0.0 2023-09-28 20:59:52,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:59:54,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 20:59:55,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:59:57,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=140706.66666666666, ans=0.125 2023-09-28 21:00:01,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:00:03,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=140706.66666666666, ans=0.0 2023-09-28 21:00:04,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 21:00:07,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:00:13,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=140773.33333333334, ans=10.0 2023-09-28 21:00:14,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:18,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:23,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:27,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 21:00:31,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:00:31,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:00:33,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:00:36,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:37,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:37,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 21:00:42,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:42,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:00:45,324 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.271e+02 2.546e+02 2.924e+02 4.595e+02, threshold=5.092e+02, percent-clipped=0.0 2023-09-28 21:00:45,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:45,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:00:47,427 INFO [train.py:1039] (1/4) Epoch 4, batch 5200, loss[loss=0.3584, simple_loss=0.3813, pruned_loss=0.1677, over 19938.00 frames. ], tot_loss[loss=0.2624, simple_loss=0.3184, pruned_loss=0.1032, over 4716660.47 frames. ], batch size: 388, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:00:47,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:00:47,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:00:49,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:00:49,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:00:51,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:00:54,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:00:57,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:01,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=140906.66666666666, ans=0.0 2023-09-28 21:01:02,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 21:01:02,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:01:02,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:05,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:07,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:01:07,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:07,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=140973.33333333334, ans=0.2 2023-09-28 21:01:08,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 21:01:11,410 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.70 vs. limit=6.0 2023-09-28 21:01:11,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:01:12,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:15,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 21:01:16,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:01:18,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:01:19,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 21:01:21,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 21:01:23,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 21:01:23,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 21:01:24,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:25,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:25,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:01:27,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 21:01:28,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:01:30,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:33,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 21:01:33,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 21:01:35,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 21:01:37,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=141106.66666666666, ans=15.0 2023-09-28 21:01:38,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 21:01:38,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:01:43,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:01:43,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:01:45,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 21:01:45,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:46,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:01:46,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:46,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:01:50,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:01:53,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:01:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:56,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:01:56,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:03,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:04,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 21:02:05,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:02:05,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:02:08,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:08,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=141240.0, ans=0.125 2023-09-28 21:02:09,650 INFO [train.py:1039] (1/4) Epoch 4, batch 5250, loss[loss=0.236, simple_loss=0.301, pruned_loss=0.08553, over 24511.00 frames. ], tot_loss[loss=0.2609, simple_loss=0.3169, pruned_loss=0.1024, over 4721445.58 frames. ], batch size: 63, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:02:09,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:02:09,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:02:11,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:02:16,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:17,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:02:18,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:02:23,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:25,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:02:26,625 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.48 vs. limit=15.0 2023-09-28 21:02:28,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:02:30,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:02:33,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 21:02:33,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:34,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:44,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=141373.33333333334, ans=0.1 2023-09-28 21:02:46,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=13.44 vs. limit=15.0 2023-09-28 21:02:56,299 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.24 vs. limit=22.5 2023-09-28 21:03:21,503 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.854e+02 2.354e+02 2.746e+02 3.335e+02 6.410e+02, threshold=5.493e+02, percent-clipped=2.0 2023-09-28 21:03:22,901 INFO [train.py:1039] (1/4) Epoch 4, batch 5300, loss[loss=0.2495, simple_loss=0.2928, pruned_loss=0.1031, over 23772.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.314, pruned_loss=0.1017, over 4710610.72 frames. ], batch size: 212, lr: 2.26e-02, grad_scale: 32.0 2023-09-28 21:03:27,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=141573.33333333334, ans=0.0 2023-09-28 21:03:30,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.18 vs. limit=6.0 2023-09-28 21:03:37,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=141640.0, ans=0.125 2023-09-28 21:03:38,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:03:38,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 21:03:38,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 21:03:38,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:38,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:38,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:39,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:39,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:39,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:03:39,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:39,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:03:39,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:03:39,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 21:03:40,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 21:03:40,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 21:03:40,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:03:40,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 21:03:40,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 21:03:40,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:41,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:41,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:41,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:41,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:03:42,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:42,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:42,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:42,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:42,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:42,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:03:42,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:42,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:03:43,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 21:03:43,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:44,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:44,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 21:03:44,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 21:03:44,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:03:44,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:03:44,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 21:03:45,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 21:03:45,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:45,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:03:46,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:46,177 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 21:03:46,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 21:03:46,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:03:46,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:46,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 21:03:46,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 21:03:46,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 21:03:47,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:55,446 INFO [train.py:1039] (1/4) Epoch 5, batch 0, loss[loss=0.2488, simple_loss=0.3034, pruned_loss=0.09709, over 23578.00 frames. ], tot_loss[loss=0.2488, simple_loss=0.3034, pruned_loss=0.09709, over 23578.00 frames. ], batch size: 149, lr: 2.11e-02, grad_scale: 32.0 2023-09-28 21:03:55,446 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 21:04:10,263 INFO [train.py:1071] (1/4) Epoch 5, validation: loss=0.3547, simple_loss=0.3281, pruned_loss=0.1907, over 1125622.00 frames. 2023-09-28 21:04:10,264 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 21:04:10,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 21:04:12,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:04:14,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:04:19,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:19,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:04:20,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:21,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 21:04:23,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 21:04:25,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:26,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:32,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:04:32,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:32,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 21:04:34,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=141720.0, ans=0.125 2023-09-28 21:04:35,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:45,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.73 vs. limit=15.0 2023-09-28 21:04:46,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:04:46,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:48,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 21:04:52,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:04:52,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:04:53,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:04:58,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:04:59,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=141853.33333333334, ans=0.125 2023-09-28 21:05:02,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:07,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 21:05:08,183 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=15.0 2023-09-28 21:05:10,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 21:05:10,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:10,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:11,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:05:11,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:05:14,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 21:05:17,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:19,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:23,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:05:27,475 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 21:05:28,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:05:31,966 INFO [train.py:1039] (1/4) Epoch 5, batch 50, loss[loss=0.2491, simple_loss=0.3204, pruned_loss=0.08886, over 24684.00 frames. ], tot_loss[loss=0.2576, simple_loss=0.3133, pruned_loss=0.101, over 1061793.21 frames. ], batch size: 73, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:05:32,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:35,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:36,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 21:05:36,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:05:36,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:05:39,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:40,066 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.28 vs. limit=15.0 2023-09-28 21:05:42,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:45,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:48,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 21:05:48,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:55,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:05:57,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 21:05:59,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 21:06:02,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:06:02,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:02,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:02,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:05,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:06:05,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:06:05,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:06,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=142120.0, ans=0.0 2023-09-28 21:06:12,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:13,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:14,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:06:15,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 21:06:18,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:06:19,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:06:19,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 21:06:20,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:22,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 21:06:23,520 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-09-28 21:06:24,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=142186.66666666666, ans=0.05 2023-09-28 21:06:29,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:06:29,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:29,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=142186.66666666666, ans=0.125 2023-09-28 21:06:31,347 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.197e+02 2.413e+02 2.834e+02 4.473e+02, threshold=4.826e+02, percent-clipped=0.0 2023-09-28 21:06:31,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:33,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:33,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:36,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 21:06:36,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 21:06:38,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:39,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:40,782 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.11 vs. limit=15.0 2023-09-28 21:06:41,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:42,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:42,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 21:06:44,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 21:06:44,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 21:06:47,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:47,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:06:47,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 21:06:48,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 21:06:50,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:50,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:52,031 INFO [train.py:1039] (1/4) Epoch 5, batch 100, loss[loss=0.2493, simple_loss=0.3253, pruned_loss=0.08667, over 24666.00 frames. ], tot_loss[loss=0.2549, simple_loss=0.3141, pruned_loss=0.09785, over 1884606.70 frames. ], batch size: 73, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:06:52,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:06:53,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:06:55,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:06:55,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=142320.0, ans=0.125 2023-09-28 21:06:55,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=142320.0, ans=0.1 2023-09-28 21:06:57,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:07:01,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:05,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 21:07:05,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:07:12,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:07:12,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:12,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:07:12,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:07:12,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:14,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=142386.66666666666, ans=0.1 2023-09-28 21:07:15,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 21:07:17,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:07:17,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:17,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:17,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:21,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.68 vs. limit=10.0 2023-09-28 21:07:22,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 21:07:22,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:23,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:23,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:07:26,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:07:30,056 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 21:07:30,095 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 21:07:31,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:07:31,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:07:36,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:07:39,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:43,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=12.0 2023-09-28 21:07:46,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=142520.0, ans=0.2 2023-09-28 21:07:47,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:47,451 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 21:07:49,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:07:52,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:07:52,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:07:53,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:54,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=142520.0, ans=0.125 2023-09-28 21:07:57,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=142586.66666666666, ans=0.125 2023-09-28 21:07:58,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:01,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:03,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:08:06,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:06,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:08,729 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=15.0 2023-09-28 21:08:09,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:09,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:08:09,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:09,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 21:08:11,192 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 21:08:11,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:11,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:08:12,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:13,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:13,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 21:08:13,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:08:14,815 INFO [train.py:1039] (1/4) Epoch 5, batch 150, loss[loss=0.2776, simple_loss=0.3158, pruned_loss=0.1197, over 22764.00 frames. ], tot_loss[loss=0.2544, simple_loss=0.3127, pruned_loss=0.09805, over 2515337.31 frames. ], batch size: 322, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:08:14,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:08:14,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:15,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:16,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:16,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:08:18,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:08:18,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:23,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:23,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:23,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:24,616 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.77 vs. limit=22.5 2023-09-28 21:08:26,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:27,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=142653.33333333334, ans=0.1 2023-09-28 21:08:28,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:29,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:08:31,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:35,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 21:08:35,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 21:08:35,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 21:08:38,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:08:38,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:08:40,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:08:42,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:42,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:42,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:42,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=142720.0, ans=0.125 2023-09-28 21:08:43,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:43,965 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 21:08:46,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.38 vs. limit=6.0 2023-09-28 21:08:47,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:52,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:56,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:08:57,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 21:09:00,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:09:00,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:09:02,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:05,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:09:07,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:09:08,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:09:08,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:08,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 21:09:14,557 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.253e+02 2.610e+02 3.187e+02 7.657e+02, threshold=5.219e+02, percent-clipped=8.0 2023-09-28 21:09:14,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:14,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:14,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:09:14,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:09:18,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:20,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 21:09:22,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:09:23,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:09:25,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:27,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:09:27,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 21:09:29,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:29,051 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 21:09:32,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:36,815 INFO [train.py:1039] (1/4) Epoch 5, batch 200, loss[loss=0.292, simple_loss=0.3304, pruned_loss=0.1268, over 23795.00 frames. ], tot_loss[loss=0.2569, simple_loss=0.3145, pruned_loss=0.09971, over 3013054.85 frames. ], batch size: 179, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:09:37,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:09:38,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:09:40,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 21:09:41,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:41,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:43,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 21:09:46,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:09:48,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:48,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:53,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:09:53,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:53,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:54,673 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.12 vs. limit=15.0 2023-09-28 21:09:55,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=143053.33333333334, ans=0.125 2023-09-28 21:09:59,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=143053.33333333334, ans=0.125 2023-09-28 21:10:07,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=143053.33333333334, ans=0.125 2023-09-28 21:10:07,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=143053.33333333334, ans=0.0 2023-09-28 21:10:12,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:10:13,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:10:13,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=143120.0, ans=0.2 2023-09-28 21:10:15,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:10:16,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:10:17,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:10:17,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:10:20,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:22,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:10:22,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:22,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:24,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 21:10:25,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:10:25,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:31,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:10:34,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:43,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:43,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:10:52,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:54,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 21:10:55,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:55,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:10:55,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:57,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:10:59,100 INFO [train.py:1039] (1/4) Epoch 5, batch 250, loss[loss=0.2172, simple_loss=0.2848, pruned_loss=0.07475, over 24372.00 frames. ], tot_loss[loss=0.2566, simple_loss=0.3142, pruned_loss=0.09948, over 3396579.76 frames. ], batch size: 61, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:10:59,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 21:11:00,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:02,081 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 21:11:03,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:07,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:11:07,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:08,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:11:10,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:11:10,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:11,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:11:16,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:11:29,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:31,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:11:32,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:11:37,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:11:39,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:11:40,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:11:40,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:41,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:11:41,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:11:41,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:44,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:11:47,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 21:11:47,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:49,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:11:49,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:11:49,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:11:51,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:11:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:11:53,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:11:56,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:57,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:11:57,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:11:59,960 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.236e+02 2.772e+02 3.274e+02 8.100e+02, threshold=5.544e+02, percent-clipped=4.0 2023-09-28 21:12:01,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:12:03,690 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:12:04,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:08,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:12:08,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=143586.66666666666, ans=0.0 2023-09-28 21:12:16,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:16,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:12:19,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 21:12:21,360 INFO [train.py:1039] (1/4) Epoch 5, batch 300, loss[loss=0.2666, simple_loss=0.3327, pruned_loss=0.1003, over 24402.00 frames. ], tot_loss[loss=0.2551, simple_loss=0.3118, pruned_loss=0.09918, over 3679383.09 frames. ], batch size: 77, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:12:21,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:12:21,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:12:24,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 21:12:24,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:12:26,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:12:26,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 21:12:30,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:33,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:12:36,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:12:36,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=143720.0, ans=0.125 2023-09-28 21:12:38,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 21:12:39,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:41,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:12:41,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 21:12:41,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:45,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:12:52,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:12:52,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 21:12:55,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 21:12:55,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:58,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:59,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:59,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 21:12:59,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:13:02,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:13:03,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:13:05,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:10,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:13:10,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 21:13:11,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:13:14,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:14,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 21:13:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:20,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:13:23,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:13:23,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 21:13:28,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:28,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:13:31,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:31,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:13:33,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 21:13:33,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:13:33,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:34,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 21:13:36,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:38,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:38,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:38,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:40,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:44,699 INFO [train.py:1039] (1/4) Epoch 5, batch 350, loss[loss=0.2315, simple_loss=0.2872, pruned_loss=0.08791, over 24412.00 frames. ], tot_loss[loss=0.253, simple_loss=0.3087, pruned_loss=0.09862, over 3879169.13 frames. ], batch size: 58, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:13:44,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:13:44,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 21:13:50,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:56,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:59,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:59,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:04,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 21:14:05,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:05,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 21:14:08,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:08,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 21:14:09,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=144053.33333333334, ans=0.125 2023-09-28 21:14:10,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:14,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 21:14:15,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:14:17,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:18,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:14:20,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:20,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:21,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:21,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:21,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:14:23,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:14:23,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:30,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:14:30,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:14:32,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:14:32,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:32,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=144186.66666666666, ans=0.0 2023-09-28 21:14:38,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 21:14:38,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:43,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=144186.66666666666, ans=15.0 2023-09-28 21:14:44,465 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 2.143e+02 2.367e+02 2.704e+02 4.411e+02, threshold=4.734e+02, percent-clipped=0.0 2023-09-28 21:14:44,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:44,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:14:44,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:45,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=144186.66666666666, ans=0.0 2023-09-28 21:14:48,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 21:14:48,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:14:50,778 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 21:14:52,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 21:14:52,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:55,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:55,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 21:14:56,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:02,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:15:04,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:04,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:04,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:05,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:07,258 INFO [train.py:1039] (1/4) Epoch 5, batch 400, loss[loss=0.2401, simple_loss=0.2999, pruned_loss=0.09014, over 24653.00 frames. ], tot_loss[loss=0.2514, simple_loss=0.308, pruned_loss=0.09736, over 4068605.77 frames. ], batch size: 65, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:15:08,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:15:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:15:13,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 21:15:13,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:14,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:14,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:15:16,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:18,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:20,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:23,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 21:15:27,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 21:15:27,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:27,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 21:15:28,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:34,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:15:34,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:34,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 21:15:34,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:15:34,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:34,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:35,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:37,197 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 21:15:38,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 21:15:40,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=144453.33333333334, ans=0.125 2023-09-28 21:15:43,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:43,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:45,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 21:15:46,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 21:15:49,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:15:49,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=144453.33333333334, ans=0.125 2023-09-28 21:15:52,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:01,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 21:16:04,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:16:06,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 21:16:08,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:16:09,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:16:09,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 21:16:13,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:16:15,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=144586.66666666666, ans=0.125 2023-09-28 21:16:16,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:16:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:16:21,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:21,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 21:16:24,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:16:24,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 21:16:25,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:16:25,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:16:28,810 INFO [train.py:1039] (1/4) Epoch 5, batch 450, loss[loss=0.2476, simple_loss=0.3008, pruned_loss=0.09725, over 23658.00 frames. ], tot_loss[loss=0.2537, simple_loss=0.3098, pruned_loss=0.09885, over 4209413.30 frames. ], batch size: 149, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:16:28,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 21:16:32,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:16:32,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:16:34,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:16:36,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 21:16:36,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:16:37,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:16:39,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:16:39,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 21:16:39,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:16:40,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:16:44,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:16:53,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:54,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:16:56,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 21:16:56,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 21:16:59,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:16:59,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=144720.0, ans=0.125 2023-09-28 21:17:02,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:05,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:09,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:11,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:14,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 21:17:15,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 21:17:16,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=144786.66666666666, ans=0.125 2023-09-28 21:17:17,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 21:17:17,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:17:19,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:19,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:17:21,557 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 21:17:22,908 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 21:17:22,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:24,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=144853.33333333334, ans=0.125 2023-09-28 21:17:25,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:17:25,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:17:27,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=144853.33333333334, ans=0.125 2023-09-28 21:17:30,466 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.848e+02 2.241e+02 2.627e+02 3.194e+02 6.560e+02, threshold=5.254e+02, percent-clipped=4.0 2023-09-28 21:17:30,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:17:30,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:17:32,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:17:32,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 21:17:35,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:36,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:17:36,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:17:38,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 21:17:43,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:17:43,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 21:17:45,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 21:17:47,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:48,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=144920.0, ans=0.2 2023-09-28 21:17:51,509 INFO [train.py:1039] (1/4) Epoch 5, batch 500, loss[loss=0.2496, simple_loss=0.3105, pruned_loss=0.0943, over 23411.00 frames. ], tot_loss[loss=0.2526, simple_loss=0.3094, pruned_loss=0.09795, over 4317828.20 frames. ], batch size: 134, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:17:53,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:17:55,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:17:56,527 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.66 vs. limit=10.0 2023-09-28 21:17:56,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:17:57,032 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 21:18:01,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:01,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:18:01,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:01,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 21:18:02,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=144986.66666666666, ans=0.125 2023-09-28 21:18:03,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 21:18:03,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:06,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:18:11,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 21:18:11,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:18:12,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:18:14,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:14,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:18,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=145053.33333333334, ans=0.125 2023-09-28 21:18:25,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:25,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:18:26,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:18:27,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:27,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 21:18:27,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:18:31,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=145120.0, ans=0.0 2023-09-28 21:18:32,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:18:33,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:18:33,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:18:33,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:33,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 21:18:38,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 21:18:39,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:18:41,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:18:43,848 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.48 vs. limit=15.0 2023-09-28 21:18:46,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 21:18:47,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:18:51,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:18:55,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:55,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=145186.66666666666, ans=0.07 2023-09-28 21:18:58,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:59,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=145253.33333333334, ans=0.0 2023-09-28 21:19:05,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:07,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 21:19:08,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:09,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:10,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=145253.33333333334, ans=0.2 2023-09-28 21:19:12,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 21:19:12,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:19:15,051 INFO [train.py:1039] (1/4) Epoch 5, batch 550, loss[loss=0.2288, simple_loss=0.2864, pruned_loss=0.0856, over 24427.00 frames. ], tot_loss[loss=0.2529, simple_loss=0.3104, pruned_loss=0.09768, over 4412105.94 frames. ], batch size: 58, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:19:15,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:20,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 21:19:20,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=145320.0, ans=0.125 2023-09-28 21:19:20,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.51 vs. limit=22.5 2023-09-28 21:19:21,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 21:19:23,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:23,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 21:19:23,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:19:23,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:24,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:19:26,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:19:28,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:30,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 21:19:30,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:19:30,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=145386.66666666666, ans=0.0 2023-09-28 21:19:35,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:19:35,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:38,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:19:40,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:44,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 21:19:46,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 21:19:48,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:19:52,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:19:52,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:19:54,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:19:55,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=145453.33333333334, ans=0.125 2023-09-28 21:19:56,116 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:19:57,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=145453.33333333334, ans=0.125 2023-09-28 21:20:00,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:00,931 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 21:20:01,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:20:03,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:20:06,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:20:06,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:20:06,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:20:08,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:09,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 21:20:11,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 21:20:11,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:11,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:20:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:20:13,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:20:16,500 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.228e+02 2.515e+02 3.038e+02 5.618e+02, threshold=5.030e+02, percent-clipped=1.0 2023-09-28 21:20:16,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:20:16,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:20:19,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:20:21,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:21,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 21:20:22,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:20:23,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=145586.66666666666, ans=0.1 2023-09-28 21:20:24,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:24,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:20:26,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:26,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:20:27,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:20:33,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 21:20:38,076 INFO [train.py:1039] (1/4) Epoch 5, batch 600, loss[loss=0.2546, simple_loss=0.3066, pruned_loss=0.1013, over 23450.00 frames. ], tot_loss[loss=0.254, simple_loss=0.3113, pruned_loss=0.09835, over 4479366.04 frames. ], batch size: 119, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:20:38,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 21:20:39,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:20:39,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:20:41,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:46,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=15.0 2023-09-28 21:20:48,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:20:50,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:20:52,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 21:20:54,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-09-28 21:20:55,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:20:55,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:20:58,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:01,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 21:21:02,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:21:06,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=145720.0, ans=0.125 2023-09-28 21:21:07,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 21:21:11,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:21:11,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:12,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=145786.66666666666, ans=0.125 2023-09-28 21:21:13,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:21:19,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:21:19,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:21:19,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:26,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:21:31,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:31,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:21:31,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:39,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 21:21:44,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:21:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:21:49,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 21:21:49,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:21:51,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 21:21:51,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:21:51,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:21:53,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=145920.0, ans=0.04949747468305833 2023-09-28 21:21:58,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:21:59,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:22:00,847 INFO [train.py:1039] (1/4) Epoch 5, batch 650, loss[loss=0.258, simple_loss=0.3287, pruned_loss=0.09369, over 24435.00 frames. ], tot_loss[loss=0.2519, simple_loss=0.31, pruned_loss=0.0969, over 4538171.18 frames. ], batch size: 69, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:22:01,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:02,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:22:05,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:07,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 21:22:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:22:13,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:22:13,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:17,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:20,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 21:22:24,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:22:25,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:30,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:22:30,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:22:32,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=146053.33333333334, ans=0.125 2023-09-28 21:22:33,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:33,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:34,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:22:36,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:38,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:22:40,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:22:40,987 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 21:22:41,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:41,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:43,333 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-09-28 21:22:44,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:44,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:46,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:22:46,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:22:47,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 21:22:50,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:22:51,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:53,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:22:53,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:53,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:22:53,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=146186.66666666666, ans=0.125 2023-09-28 21:22:55,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 21:22:56,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 21:22:57,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=146186.66666666666, ans=0.2 2023-09-28 21:22:58,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:58,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:58,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:22:58,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:23:01,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:23:04,874 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.282e+02 2.474e+02 2.887e+02 4.172e+02, threshold=4.947e+02, percent-clipped=0.0 2023-09-28 21:23:06,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:06,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:08,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:23:11,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:11,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:23:12,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:19,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:23:19,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:19,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:19,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:24,869 INFO [train.py:1039] (1/4) Epoch 5, batch 700, loss[loss=0.2339, simple_loss=0.2842, pruned_loss=0.0918, over 23748.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3087, pruned_loss=0.09735, over 4558430.72 frames. ], batch size: 212, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:23:27,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 21:23:27,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 21:23:29,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=146320.0, ans=0.0 2023-09-28 21:23:30,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 21:23:31,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:33,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:23:33,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 21:23:39,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:42,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:23:44,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:46,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:23:46,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:49,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:52,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:23:52,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:23:55,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 21:24:00,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 21:24:05,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:24:05,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:24:07,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:24:10,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:24:12,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 21:24:15,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:15,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:24:15,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 21:24:21,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:24:23,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:25,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:24:30,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:24:31,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 21:24:37,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 21:24:38,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 21:24:38,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:41,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.82 vs. limit=22.5 2023-09-28 21:24:41,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:24:42,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:24:44,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:44,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 21:24:47,199 INFO [train.py:1039] (1/4) Epoch 5, batch 750, loss[loss=0.2659, simple_loss=0.312, pruned_loss=0.1099, over 23848.00 frames. ], tot_loss[loss=0.252, simple_loss=0.3097, pruned_loss=0.09718, over 4600941.00 frames. ], batch size: 212, lr: 2.07e-02, grad_scale: 16.0 2023-09-28 21:24:47,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=146653.33333333334, ans=0.0 2023-09-28 21:24:48,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 21:24:48,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 21:24:48,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 21:24:50,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 21:24:52,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 21:24:52,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:24:53,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 21:24:55,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:55,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:24:58,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:24:59,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=146653.33333333334, ans=0.0 2023-09-28 21:25:00,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:02,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:25:02,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:04,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:25:05,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:25:08,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:25:12,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:14,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:14,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 21:25:14,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:25:17,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:25:20,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=146786.66666666666, ans=0.125 2023-09-28 21:25:21,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 21:25:21,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:25:23,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 21:25:24,852 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 21:25:24,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 21:25:24,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:25:24,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:25:25,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=146786.66666666666, ans=0.125 2023-09-28 21:25:28,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:25:34,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:25:34,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:34,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:25:37,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.86 vs. limit=6.0 2023-09-28 21:25:37,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:39,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:39,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 21:25:40,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:25:42,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 21:25:43,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:25:47,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:25:47,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 21:25:47,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:51,067 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.300e+02 2.781e+02 3.196e+02 5.681e+02, threshold=5.563e+02, percent-clipped=1.0 2023-09-28 21:25:52,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:25:55,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:25:55,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:55,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=146920.0, ans=0.0 2023-09-28 21:25:58,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:26:01,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 21:26:01,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:02,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:10,185 INFO [train.py:1039] (1/4) Epoch 5, batch 800, loss[loss=0.2366, simple_loss=0.3141, pruned_loss=0.07953, over 24670.00 frames. ], tot_loss[loss=0.2527, simple_loss=0.3106, pruned_loss=0.09745, over 4630924.44 frames. ], batch size: 68, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:26:10,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:12,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:26:18,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:18,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:19,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=146986.66666666666, ans=0.125 2023-09-28 21:26:22,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:22,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:23,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:23,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:24,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:29,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:31,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:26:31,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=147053.33333333334, ans=0.0 2023-09-28 21:26:34,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 21:26:35,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:37,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:37,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:26:37,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:38,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 21:26:38,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:38,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 21:26:42,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:43,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:45,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:47,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:50,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:50,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:56,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:26:56,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:26:56,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 21:26:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 21:26:58,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 21:26:58,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:27:00,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:01,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:01,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:07,084 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 21:27:07,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 21:27:08,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:27:10,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:27:13,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:27:18,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:18,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 21:27:19,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:27:22,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 21:27:31,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:32,902 INFO [train.py:1039] (1/4) Epoch 5, batch 850, loss[loss=0.2585, simple_loss=0.3309, pruned_loss=0.09309, over 24057.00 frames. ], tot_loss[loss=0.255, simple_loss=0.3124, pruned_loss=0.0988, over 4653982.44 frames. ], batch size: 80, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:27:33,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:27:34,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 21:27:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:27:36,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:37,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=147320.0, ans=0.2 2023-09-28 21:27:38,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 21:27:38,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:40,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:27:42,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:43,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:27:44,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.14 vs. limit=15.0 2023-09-28 21:27:45,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:46,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 21:27:46,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 21:27:46,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 21:27:48,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:48,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:27:51,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:51,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:52,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:27:53,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=147386.66666666666, ans=0.2 2023-09-28 21:27:58,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:58,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:58,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 21:28:03,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 21:28:06,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:28:07,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 21:28:09,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.16 vs. limit=15.0 2023-09-28 21:28:11,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 21:28:13,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 21:28:14,760 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 21:28:14,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:14,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:28:14,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:28:18,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 21:28:21,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=147520.0, ans=0.1 2023-09-28 21:28:23,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:24,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:24,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:28:24,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:28:26,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:28:28,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:28:28,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 21:28:34,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:28:34,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:35,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:28:36,313 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 2.245e+02 2.598e+02 3.142e+02 5.686e+02, threshold=5.195e+02, percent-clipped=1.0 2023-09-28 21:28:36,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:36,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:38,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:40,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:28:41,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:28:42,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:28:43,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:28:51,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:28:53,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:53,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 21:28:54,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:28:54,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:56,151 INFO [train.py:1039] (1/4) Epoch 5, batch 900, loss[loss=0.2661, simple_loss=0.3151, pruned_loss=0.1086, over 23667.00 frames. ], tot_loss[loss=0.2564, simple_loss=0.3136, pruned_loss=0.09956, over 4661714.53 frames. ], batch size: 149, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:28:57,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 21:29:05,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:29:07,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:07,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 21:29:10,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:29:12,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 21:29:12,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:29:14,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:29:14,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:14,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:29:15,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:29:15,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=147720.0, ans=0.0 2023-09-28 21:29:24,585 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.35 vs. limit=15.0 2023-09-28 21:29:26,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:29:26,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:28,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:29:31,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:37,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 21:29:37,986 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.70 vs. limit=6.0 2023-09-28 21:29:38,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:29:42,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:29:42,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:29:42,293 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 21:29:43,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 21:29:52,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:29:52,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:29:52,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:29:54,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=147853.33333333334, ans=0.0 2023-09-28 21:29:59,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:00,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:00,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=147920.0, ans=0.0 2023-09-28 21:30:01,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 21:30:01,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:30:05,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 21:30:08,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:30:08,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:10,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:30:10,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:15,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 21:30:15,114 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 21:30:16,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:30:16,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 21:30:18,138 INFO [train.py:1039] (1/4) Epoch 5, batch 950, loss[loss=0.2138, simple_loss=0.2759, pruned_loss=0.07589, over 24443.00 frames. ], tot_loss[loss=0.2569, simple_loss=0.3141, pruned_loss=0.09991, over 4659494.71 frames. ], batch size: 58, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:30:19,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:25,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 21:30:30,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=147986.66666666666, ans=0.125 2023-09-28 21:30:31,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:33,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:33,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:35,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:30:35,375 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 21:30:38,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:40,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:30:40,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:40,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:30:42,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 21:30:43,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=148053.33333333334, ans=0.125 2023-09-28 21:30:44,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:30:45,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:47,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 21:30:47,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:50,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:50,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:51,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:53,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 21:30:56,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:30:57,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:31:00,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:31:04,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:04,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:31:07,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 21:31:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:31:08,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:31:10,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:11,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:11,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:31:17,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 21:31:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:31:21,890 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.107e+02 2.418e+02 2.816e+02 4.980e+02, threshold=4.836e+02, percent-clipped=0.0 2023-09-28 21:31:22,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:22,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:22,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 21:31:22,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:22,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:31:23,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 21:31:28,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:31:29,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:34,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=148253.33333333334, ans=0.2 2023-09-28 21:31:35,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:31:35,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=148253.33333333334, ans=0.1 2023-09-28 21:31:36,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 21:31:36,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 21:31:38,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=148253.33333333334, ans=0.125 2023-09-28 21:31:41,351 INFO [train.py:1039] (1/4) Epoch 5, batch 1000, loss[loss=0.2382, simple_loss=0.2684, pruned_loss=0.104, over 19516.00 frames. ], tot_loss[loss=0.2561, simple_loss=0.3128, pruned_loss=0.0997, over 4655397.85 frames. ], batch size: 388, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:31:41,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:41,937 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:31:45,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 21:31:45,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:50,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:31:52,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 21:31:52,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 21:32:00,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:00,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:32:02,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:02,876 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=5.037e-03 2023-09-28 21:32:04,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 21:32:08,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 21:32:11,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 21:32:11,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:12,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 21:32:14,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 21:32:14,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 21:32:14,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:16,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:18,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=148453.33333333334, ans=0.0 2023-09-28 21:32:23,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:24,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:32:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:26,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:26,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 21:32:28,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:28,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:32:29,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:29,859 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 21:32:34,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 21:32:34,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 21:32:34,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=148520.0, ans=0.125 2023-09-28 21:32:38,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 21:32:39,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:32:40,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=148520.0, ans=0.125 2023-09-28 21:32:47,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:47,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:32:47,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:49,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:32:50,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 21:32:53,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:32:53,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 21:32:54,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 21:32:56,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:32:56,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:59,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:33:02,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:33:03,574 INFO [train.py:1039] (1/4) Epoch 5, batch 1050, loss[loss=0.2474, simple_loss=0.3212, pruned_loss=0.08676, over 24426.00 frames. ], tot_loss[loss=0.2542, simple_loss=0.3109, pruned_loss=0.09874, over 4661143.99 frames. ], batch size: 69, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:33:03,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:07,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:33:09,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:33:10,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:33:12,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:15,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:16,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:33:18,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:33:20,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=148720.0, ans=0.0 2023-09-28 21:33:21,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:33:21,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:33:21,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:33:24,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:33:25,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 21:33:26,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:28,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 21:33:29,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:33:29,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 21:33:29,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:33:34,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:35,743 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-09-28 21:33:36,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:33:36,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:39,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 21:33:39,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 21:33:39,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:45,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 21:33:48,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 21:33:48,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:53,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:33:55,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:33:55,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:33:56,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:34:01,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:34:04,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 21:34:06,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 21:34:06,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 21:34:07,755 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.205e+02 2.391e+02 2.864e+02 4.460e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-28 21:34:07,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:07,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:34:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 21:34:14,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:34:17,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:17,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:17,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:17,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 21:34:23,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:23,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 21:34:23,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 21:34:24,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:34:27,617 INFO [train.py:1039] (1/4) Epoch 5, batch 1100, loss[loss=0.2388, simple_loss=0.3064, pruned_loss=0.08558, over 24447.00 frames. ], tot_loss[loss=0.2529, simple_loss=0.3099, pruned_loss=0.09798, over 4658933.22 frames. ], batch size: 63, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:34:29,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:34:34,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:34:37,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=148986.66666666666, ans=0.125 2023-09-28 21:34:40,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:34:41,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:34:41,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:43,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 21:34:44,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:34:46,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=149053.33333333334, ans=0.1 2023-09-28 21:34:47,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:34:49,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:34:51,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:34:52,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 21:34:54,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:34:56,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:56,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:58,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:34:59,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:35:04,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=149120.0, ans=0.1 2023-09-28 21:35:05,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:35:08,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 21:35:11,135 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 21:35:11,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:12,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=149120.0, ans=0.1 2023-09-28 21:35:15,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:15,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:35:17,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:35:17,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 21:35:18,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:35:18,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:35:18,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:35:20,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:20,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 21:35:26,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:35:26,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 21:35:28,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:35:33,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:35:35,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 21:35:35,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:35:37,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:40,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:35:41,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:41,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 21:35:43,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:35:45,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:45,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 21:35:45,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:35:47,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 21:35:48,567 INFO [train.py:1039] (1/4) Epoch 5, batch 1150, loss[loss=0.2231, simple_loss=0.2864, pruned_loss=0.07995, over 24310.00 frames. ], tot_loss[loss=0.2527, simple_loss=0.3098, pruned_loss=0.09781, over 4669691.04 frames. ], batch size: 56, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:35:48,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:35:48,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:35:50,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:35:56,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:35:57,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:35:59,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:01,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:36:01,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 21:36:01,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:04,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=149386.66666666666, ans=0.5 2023-09-28 21:36:05,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 21:36:05,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:05,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:36:12,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 21:36:14,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:17,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:19,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:19,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 21:36:19,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:36:21,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:24,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 21:36:26,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:28,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:29,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=149453.33333333334, ans=0.5 2023-09-28 21:36:41,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 21:36:48,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:48,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:50,954 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.166e+02 2.435e+02 2.809e+02 4.003e+02, threshold=4.871e+02, percent-clipped=0.0 2023-09-28 21:36:52,848 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 21:36:54,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:04,051 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 21:37:04,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=149586.66666666666, ans=0.05 2023-09-28 21:37:04,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=149586.66666666666, ans=0.0 2023-09-28 21:37:06,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=149586.66666666666, ans=0.125 2023-09-28 21:37:07,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:07,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:37:09,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:37:09,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:37:10,881 INFO [train.py:1039] (1/4) Epoch 5, batch 1200, loss[loss=0.2724, simple_loss=0.3425, pruned_loss=0.1011, over 24312.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3106, pruned_loss=0.09815, over 4676391.96 frames. ], batch size: 74, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:37:13,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:18,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.41 vs. limit=15.0 2023-09-28 21:37:20,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:37:20,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:37:22,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:22,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:22,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:37:25,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:37:28,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:37:30,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:30,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:31,014 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.03 vs. limit=15.0 2023-09-28 21:37:31,977 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 21:37:35,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 21:37:38,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:37:41,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:37:43,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:47,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:37:47,365 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 21:37:48,121 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.60 vs. limit=22.5 2023-09-28 21:37:48,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:55,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:37:55,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:37:55,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 21:37:57,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:37:57,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=149786.66666666666, ans=0.125 2023-09-28 21:38:00,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 21:38:04,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.29 vs. limit=15.0 2023-09-28 21:38:04,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 21:38:04,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:38:06,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:38:09,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:09,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:38:11,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:38:11,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:38:12,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:38:13,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 21:38:14,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:38:14,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:38:14,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:38:18,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:18,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:20,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=149920.0, ans=0.0 2023-09-28 21:38:23,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:38:24,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:38:28,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 21:38:30,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=149920.0, ans=0.0 2023-09-28 21:38:31,834 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 21:38:33,135 INFO [train.py:1039] (1/4) Epoch 5, batch 1250, loss[loss=0.2526, simple_loss=0.3041, pruned_loss=0.1005, over 23838.00 frames. ], tot_loss[loss=0.254, simple_loss=0.3111, pruned_loss=0.09847, over 4695462.95 frames. ], batch size: 195, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:38:34,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:38:36,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:38:37,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:38:39,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:41,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 21:38:44,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:38:46,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:47,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 21:38:49,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:38:51,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:38:51,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.60 vs. limit=15.0 2023-09-28 21:38:56,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:38:56,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:58,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:38:58,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:00,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:39:03,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 21:39:03,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:03,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:04,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:06,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:09,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:09,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=150120.0, ans=0.2 2023-09-28 21:39:12,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:39:16,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 21:39:17,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:39:20,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:20,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 21:39:20,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:39:20,991 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 21:39:21,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=150186.66666666666, ans=0.125 2023-09-28 21:39:22,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:22,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:26,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:39:31,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 21:39:31,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 21:39:32,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 21:39:36,091 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.883e+02 2.272e+02 2.528e+02 2.863e+02 4.623e+02, threshold=5.057e+02, percent-clipped=0.0 2023-09-28 21:39:36,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:39:37,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 21:39:37,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:39,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:39:40,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:39:41,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 21:39:41,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:39:42,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:39:42,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:39:42,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:46,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 21:39:48,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:50,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:39:52,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:39:53,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:55,815 INFO [train.py:1039] (1/4) Epoch 5, batch 1300, loss[loss=0.2462, simple_loss=0.2898, pruned_loss=0.1013, over 23585.00 frames. ], tot_loss[loss=0.2549, simple_loss=0.3116, pruned_loss=0.09906, over 4680840.74 frames. ], batch size: 232, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:39:58,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:58,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 21:40:03,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:04,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=150320.0, ans=0.125 2023-09-28 21:40:06,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:40:06,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:08,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:40:10,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:40:10,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 21:40:16,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:40:17,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:40:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 21:40:22,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:40:26,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:27,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:28,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=150453.33333333334, ans=0.0 2023-09-28 21:40:30,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:30,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:32,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:40:32,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:40:33,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=150453.33333333334, ans=0.125 2023-09-28 21:40:34,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 21:40:39,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:40:39,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:40:42,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 21:40:42,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:40:45,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:40:48,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:48,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 21:40:48,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:48,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 21:40:51,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:53,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:53,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:40:58,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 21:40:59,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 21:41:01,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 21:41:06,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:41:09,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 21:41:11,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:18,070 INFO [train.py:1039] (1/4) Epoch 5, batch 1350, loss[loss=0.2281, simple_loss=0.2956, pruned_loss=0.0803, over 18808.00 frames. ], tot_loss[loss=0.2539, simple_loss=0.3103, pruned_loss=0.09876, over 4682130.39 frames. ], batch size: 41, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:41:18,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 21:41:21,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:24,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:26,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=150653.33333333334, ans=0.07 2023-09-28 21:41:27,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:27,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:28,233 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:41:31,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:41:31,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:35,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:36,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=150720.0, ans=0.0 2023-09-28 21:41:38,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 21:41:41,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:41:41,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:41:43,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 21:41:43,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:41:44,207 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.45 vs. limit=22.5 2023-09-28 21:41:46,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:41:46,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 21:41:47,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 21:41:51,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 21:41:52,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:52,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 21:42:03,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:03,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=150786.66666666666, ans=0.07 2023-09-28 21:42:15,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:16,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:16,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 21:42:19,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:20,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 21:42:20,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:42:22,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.340e+02 2.561e+02 2.889e+02 4.488e+02, threshold=5.123e+02, percent-clipped=0.0 2023-09-28 21:42:22,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:42:25,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:42:27,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 21:42:30,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:42:35,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 21:42:36,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 21:42:40,599 INFO [train.py:1039] (1/4) Epoch 5, batch 1400, loss[loss=0.2548, simple_loss=0.308, pruned_loss=0.1008, over 23730.00 frames. ], tot_loss[loss=0.2526, simple_loss=0.3089, pruned_loss=0.09819, over 4679392.20 frames. ], batch size: 149, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:42:43,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 21:42:45,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:48,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:42:49,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:42:52,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=150986.66666666666, ans=0.0 2023-09-28 21:42:55,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 21:42:56,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 21:43:08,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:43:09,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:11,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:43:11,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:43:16,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:43:16,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:43:17,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=151120.0, ans=0.0 2023-09-28 21:43:27,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:27,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:32,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 21:43:32,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:43:32,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:43:33,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:43:35,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:35,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:43:37,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:43:37,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:43:38,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 21:43:38,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:43:40,932 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.62 vs. limit=22.5 2023-09-28 21:43:43,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:48,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:43:56,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 21:43:58,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:43:58,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:44:01,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:44:02,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:03,484 INFO [train.py:1039] (1/4) Epoch 5, batch 1450, loss[loss=0.2378, simple_loss=0.3088, pruned_loss=0.08337, over 24675.00 frames. ], tot_loss[loss=0.2515, simple_loss=0.3082, pruned_loss=0.09739, over 4693578.13 frames. ], batch size: 68, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:44:05,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:44:08,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:44:10,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:44:10,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:10,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:44:14,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:15,429 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.03 vs. limit=15.0 2023-09-28 21:44:16,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:44:16,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:44:17,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 21:44:19,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:44:19,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 21:44:19,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:20,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:20,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 21:44:24,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:25,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:44:25,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 21:44:26,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:26,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:44:29,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:31,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:34,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:44:34,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:44:37,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:37,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:41,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:41,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:44:41,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:41,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:44:43,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=151453.33333333334, ans=0.0 2023-09-28 21:44:43,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=151453.33333333334, ans=0.1 2023-09-28 21:44:44,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 21:44:47,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:52,391 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 21:44:53,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:44:54,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=151520.0, ans=0.0 2023-09-28 21:44:55,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:44:57,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:44:59,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 21:45:04,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:06,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 21:45:07,709 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.862e+02 2.279e+02 2.648e+02 3.024e+02 3.849e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 21:45:07,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 21:45:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:11,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:12,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:45:14,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 21:45:17,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 21:45:17,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 21:45:19,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:20,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:45:21,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=151586.66666666666, ans=0.09899494936611666 2023-09-28 21:45:25,527 INFO [train.py:1039] (1/4) Epoch 5, batch 1500, loss[loss=0.2646, simple_loss=0.3187, pruned_loss=0.1053, over 23305.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3097, pruned_loss=0.0979, over 4694087.48 frames. ], batch size: 119, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:45:31,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 21:45:31,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:45:31,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:45:31,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:33,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:35,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:45:37,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 21:45:39,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:45:40,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:45:40,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:42,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:43,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:45:44,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 21:45:51,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:45:51,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:45:53,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:56,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 21:45:58,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=151786.66666666666, ans=0.1 2023-09-28 21:46:00,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 21:46:01,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:01,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 21:46:04,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:46:06,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:08,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:46:08,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:08,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=151786.66666666666, ans=0.0 2023-09-28 21:46:09,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 21:46:09,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:46:09,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:11,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 21:46:11,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:14,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=151853.33333333334, ans=0.125 2023-09-28 21:46:18,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:46:18,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 21:46:22,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:46:24,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:46:29,604 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 21:46:30,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:30,897 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 21:46:32,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:46:33,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:46:34,074 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 21:46:35,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:46:39,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 21:46:40,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:44,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:44,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:44,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:46,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:47,897 INFO [train.py:1039] (1/4) Epoch 5, batch 1550, loss[loss=0.2338, simple_loss=0.3127, pruned_loss=0.0775, over 24570.00 frames. ], tot_loss[loss=0.253, simple_loss=0.3102, pruned_loss=0.09789, over 4703433.75 frames. ], batch size: 71, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:46:48,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 21:46:49,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 21:46:49,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:46:49,890 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:46:51,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 21:46:52,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 21:46:54,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:55,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:56,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:56,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:46:57,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:57,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:57,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=151986.66666666666, ans=0.125 2023-09-28 21:47:01,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.36 vs. limit=15.0 2023-09-28 21:47:02,077 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 21:47:02,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:02,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:47:02,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:47:05,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:47:05,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 21:47:07,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:47:08,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 21:47:08,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 21:47:08,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 21:47:08,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:17,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:47:19,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=152120.0, ans=0.1 2023-09-28 21:47:20,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 21:47:20,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 21:47:27,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:30,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:47:32,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:47:32,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:47:32,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 21:47:37,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=152186.66666666666, ans=0.125 2023-09-28 21:47:38,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:47:39,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:42,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:47:45,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:47:47,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:47,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 21:47:47,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:47:50,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:47:50,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:51,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 21:47:52,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.346e+02 2.949e+02 3.489e+02 5.626e+02, threshold=5.898e+02, percent-clipped=1.0 2023-09-28 21:47:52,361 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 21:47:54,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:47:56,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=152253.33333333334, ans=10.0 2023-09-28 21:47:59,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 21:48:05,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:06,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:08,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 21:48:09,965 INFO [train.py:1039] (1/4) Epoch 5, batch 1600, loss[loss=0.2209, simple_loss=0.2886, pruned_loss=0.07658, over 24436.00 frames. ], tot_loss[loss=0.2532, simple_loss=0.3108, pruned_loss=0.09777, over 4695966.34 frames. ], batch size: 63, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:48:10,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:48:12,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:12,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:48:12,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:48:13,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:48:18,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:19,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 21:48:19,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 21:48:21,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 21:48:25,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:48:26,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 21:48:26,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=152386.66666666666, ans=0.2 2023-09-28 21:48:28,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:48:30,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:48:35,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:48:37,929 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=12.0 2023-09-28 21:48:38,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 21:48:42,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=152453.33333333334, ans=0.1 2023-09-28 21:48:43,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:48:44,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 21:48:44,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:44,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 21:48:46,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=152453.33333333334, ans=0.1 2023-09-28 21:48:49,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 21:48:58,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:59,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 21:49:00,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:49:00,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:00,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:49:01,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 21:49:07,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 21:49:08,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:49:09,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:09,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:10,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:49:13,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:49:13,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:49:16,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:49:22,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:23,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:49:27,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 21:49:27,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:49:28,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 21:49:32,702 INFO [train.py:1039] (1/4) Epoch 5, batch 1650, loss[loss=0.2223, simple_loss=0.2959, pruned_loss=0.07431, over 24482.00 frames. ], tot_loss[loss=0.2539, simple_loss=0.3111, pruned_loss=0.09832, over 4708650.29 frames. ], batch size: 66, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:49:34,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:49:35,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:49:37,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:49:37,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 21:49:37,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 21:49:37,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 21:49:37,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 21:49:43,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:44,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:44,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:49:45,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:49:46,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:49:50,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 21:49:52,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:52,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:52,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:49:52,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:49:53,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 21:49:53,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 21:49:58,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:50:02,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:50:10,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 21:50:12,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:16,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 21:50:16,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten.whitening_limit, batch_count=152786.66666666666, ans=15.0 2023-09-28 21:50:19,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:22,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:50:22,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:50:22,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:23,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:50:23,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:27,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:27,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:28,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:28,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:30,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:30,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:50:33,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:34,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 21:50:36,177 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.210e+02 2.496e+02 2.822e+02 4.651e+02, threshold=4.993e+02, percent-clipped=0.0 2023-09-28 21:50:38,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:38,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 21:50:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 21:50:40,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 21:50:40,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:41,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:50:41,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:41,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:41,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 21:50:45,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:47,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:50:47,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:49,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.40 vs. limit=15.0 2023-09-28 21:50:50,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 21:50:51,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.28 vs. limit=22.5 2023-09-28 21:50:54,696 INFO [train.py:1039] (1/4) Epoch 5, batch 1700, loss[loss=0.2593, simple_loss=0.3101, pruned_loss=0.1042, over 23490.00 frames. ], tot_loss[loss=0.2529, simple_loss=0.3105, pruned_loss=0.0977, over 4712509.46 frames. ], batch size: 120, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:50:56,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:56,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:50:56,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 21:50:58,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:50:58,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:50:58,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:59,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:50:59,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:51:01,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 21:51:04,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:51:09,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=153053.33333333334, ans=0.2 2023-09-28 21:51:13,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:51:15,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:51:22,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:51:22,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:24,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:51:24,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:26,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 21:51:26,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=153120.0, ans=10.0 2023-09-28 21:51:28,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:51:28,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:29,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:51:31,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:51:31,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=153120.0, ans=0.2 2023-09-28 21:51:34,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 21:51:34,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 21:51:35,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 21:51:39,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:51:48,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:51:49,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:51:50,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:53,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:51:54,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 21:51:54,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:57,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:57,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 21:51:58,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:51:58,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:51:58,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:59,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:00,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:00,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:52:02,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:02,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:52:02,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:05,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:05,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 21:52:09,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:10,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:12,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 21:52:13,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.01 vs. limit=15.0 2023-09-28 21:52:16,905 INFO [train.py:1039] (1/4) Epoch 5, batch 1750, loss[loss=0.2464, simple_loss=0.3137, pruned_loss=0.08954, over 23316.00 frames. ], tot_loss[loss=0.2509, simple_loss=0.3088, pruned_loss=0.09651, over 4717623.07 frames. ], batch size: 93, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:52:17,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=153320.0, ans=0.1 2023-09-28 21:52:20,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:22,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:22,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:52:25,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 21:52:25,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:52:25,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=153320.0, ans=0.125 2023-09-28 21:52:27,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:52:27,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:31,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 21:52:34,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:36,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 21:52:36,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:38,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:52:42,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:52:44,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 21:52:44,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:52:45,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 21:52:54,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:52:57,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:52:57,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:02,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:02,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:05,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:06,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:09,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:09,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:53:10,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 21:53:13,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:17,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 21:53:18,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:20,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:21,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:53:23,262 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.199e+02 2.496e+02 2.934e+02 4.192e+02, threshold=4.992e+02, percent-clipped=0.0 2023-09-28 21:53:25,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:53:25,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:53:26,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:28,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:31,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:31,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=153586.66666666666, ans=0.1 2023-09-28 21:53:33,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=153586.66666666666, ans=0.05 2023-09-28 21:53:34,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:53:35,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=153586.66666666666, ans=0.025 2023-09-28 21:53:36,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:53:36,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 21:53:36,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:38,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:53:38,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:53:38,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:53:38,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:53:39,941 INFO [train.py:1039] (1/4) Epoch 5, batch 1800, loss[loss=0.2667, simple_loss=0.3278, pruned_loss=0.1028, over 23371.00 frames. ], tot_loss[loss=0.2503, simple_loss=0.3084, pruned_loss=0.09613, over 4714252.85 frames. ], batch size: 93, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:53:40,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:53:41,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=153653.33333333334, ans=0.04949747468305833 2023-09-28 21:53:42,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:53:43,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:45,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:53:48,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:52,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 21:53:53,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:56,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:01,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:01,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:02,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:54:06,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:54:06,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 21:54:06,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:08,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:10,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=153720.0, ans=0.125 2023-09-28 21:54:13,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 21:54:16,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 21:54:16,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 21:54:18,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:18,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:18,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:54:19,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:54:22,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=153786.66666666666, ans=0.5 2023-09-28 21:54:26,525 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 21:54:28,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:54:28,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:31,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 21:54:31,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 21:54:32,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:54:32,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:54:34,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:54:39,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 21:54:44,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:54:46,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 21:54:46,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:54:46,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:47,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:54:47,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 21:54:51,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:54:51,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:54:55,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 21:54:55,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:57,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:54:57,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=153920.0, ans=0.125 2023-09-28 21:54:58,365 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.12 vs. limit=15.0 2023-09-28 21:54:59,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:54:59,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:00,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=153920.0, ans=0.02 2023-09-28 21:55:01,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:01,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:55:02,693 INFO [train.py:1039] (1/4) Epoch 5, batch 1850, loss[loss=0.2473, simple_loss=0.3108, pruned_loss=0.09193, over 24029.00 frames. ], tot_loss[loss=0.2504, simple_loss=0.3091, pruned_loss=0.09585, over 4720261.58 frames. ], batch size: 80, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:55:04,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:55:04,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:07,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:55:07,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:15,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:55:15,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 21:55:19,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 21:55:21,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 21:55:26,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:55:26,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 21:55:26,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:55:34,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=154120.0, ans=0.125 2023-09-28 21:55:35,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:55:36,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=154120.0, ans=0.125 2023-09-28 21:55:37,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 21:55:40,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:55:40,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:55:44,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 21:55:44,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:46,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:55:47,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:55:49,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:52,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:57,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:55:57,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:58,070 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.48 vs. limit=10.0 2023-09-28 21:55:58,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:55:58,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:00,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:02,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:56:05,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 21:56:06,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.86 vs. limit=15.0 2023-09-28 21:56:08,704 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.275e+02 2.646e+02 3.136e+02 5.874e+02, threshold=5.291e+02, percent-clipped=3.0 2023-09-28 21:56:08,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:13,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:56:13,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:56:13,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 21:56:13,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 21:56:16,373 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 21:56:16,500 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 21:56:16,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=154253.33333333334, ans=0.0 2023-09-28 21:56:18,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:56:20,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:56:20,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:20,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:21,543 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 21:56:22,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:56:22,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:24,335 INFO [train.py:1039] (1/4) Epoch 5, batch 1900, loss[loss=0.2507, simple_loss=0.3217, pruned_loss=0.08988, over 24453.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3085, pruned_loss=0.09492, over 4729870.71 frames. ], batch size: 66, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:56:24,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:56:26,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:56:27,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:56:28,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 21:56:31,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:31,175 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 21:56:31,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:56:32,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:35,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:39,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:56:41,064 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 21:56:42,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 21:56:44,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:46,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:56:46,136 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 21:56:46,190 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 21:56:49,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 21:56:50,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:56:55,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 21:56:58,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 21:57:04,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=154453.33333333334, ans=0.0 2023-09-28 21:57:05,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 21:57:08,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 21:57:08,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:10,765 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 21:57:10,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 21:57:12,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 21:57:12,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 21:57:12,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:57:15,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 21:57:15,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=154520.0, ans=0.1 2023-09-28 21:57:20,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:57:22,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:22,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 21:57:25,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:57:25,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=154520.0, ans=0.04949747468305833 2023-09-28 21:57:26,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 21:57:28,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:35,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:57:35,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:57:35,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:57:35,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=154586.66666666666, ans=0.0 2023-09-28 21:57:36,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:57:38,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:57:38,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 21:57:39,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:57:42,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:42,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:57:46,290 INFO [train.py:1039] (1/4) Epoch 5, batch 1950, loss[loss=0.2633, simple_loss=0.3106, pruned_loss=0.108, over 23565.00 frames. ], tot_loss[loss=0.25, simple_loss=0.3091, pruned_loss=0.09541, over 4738347.38 frames. ], batch size: 135, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:57:46,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:57:46,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:46,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:48,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:51,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:57:53,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:57:54,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.61 vs. limit=15.0 2023-09-28 21:57:55,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:55,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:57:56,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=154653.33333333334, ans=0.125 2023-09-28 21:57:58,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 21:57:58,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:57:59,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:59,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:01,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:58:03,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:03,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:06,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:09,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:58:09,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:58:09,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:58:09,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:14,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:17,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:58:17,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:17,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:58:17,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 21:58:19,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:58:19,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:58:21,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:24,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:29,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:58:32,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:58:35,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:58:35,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:58:37,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 21:58:37,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:58:41,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:41,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:58:41,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=154853.33333333334, ans=0.125 2023-09-28 21:58:42,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:58:44,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=154853.33333333334, ans=0.05 2023-09-28 21:58:48,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=154853.33333333334, ans=0.04949747468305833 2023-09-28 21:58:48,656 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.62 vs. limit=22.5 2023-09-28 21:58:50,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:52,304 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.827e+02 2.269e+02 2.574e+02 2.905e+02 4.607e+02, threshold=5.149e+02, percent-clipped=0.0 2023-09-28 21:58:52,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:55,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:57,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:00,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:59:00,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:02,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 21:59:02,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:59:03,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:04,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 21:59:06,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:08,890 INFO [train.py:1039] (1/4) Epoch 5, batch 2000, loss[loss=0.2771, simple_loss=0.3168, pruned_loss=0.1187, over 23421.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3104, pruned_loss=0.09649, over 4722084.75 frames. ], batch size: 285, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 21:59:10,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:59:12,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:59:12,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:59:15,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:59:17,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:18,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 21:59:20,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:59:23,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:59:25,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 21:59:27,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:59:27,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:30,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:59:30,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 21:59:33,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:37,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 21:59:37,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:59:38,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 21:59:38,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:39,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=12.0 2023-09-28 21:59:41,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:59:43,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:59:43,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:45,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:59:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:59:46,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 21:59:48,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 21:59:48,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:48,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:55,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:55,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=155120.0, ans=0.1 2023-09-28 21:59:58,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:59:58,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:59:58,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:00:01,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:01,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:01,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:00:01,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:03,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:05,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:00:06,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 22:00:13,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:00:14,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:15,186 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=3.147e-02 2023-09-28 22:00:18,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:18,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:00:21,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:23,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:23,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:24,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:00:24,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:00:24,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=155253.33333333334, ans=0.0 2023-09-28 22:00:27,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:29,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:31,064 INFO [train.py:1039] (1/4) Epoch 5, batch 2050, loss[loss=0.2477, simple_loss=0.2919, pruned_loss=0.1018, over 23799.00 frames. ], tot_loss[loss=0.2511, simple_loss=0.3094, pruned_loss=0.09636, over 4718201.23 frames. ], batch size: 195, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:00:31,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:31,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=155320.0, ans=0.125 2023-09-28 22:00:32,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:37,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:40,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:00:40,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:41,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:00:42,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 22:00:42,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:00:44,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:45,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:00:51,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=155386.66666666666, ans=0.2 2023-09-28 22:00:53,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=155386.66666666666, ans=0.0 2023-09-28 22:00:54,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:00:54,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:56,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=155386.66666666666, ans=0.125 2023-09-28 22:00:59,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 22:01:02,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:01:04,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 22:01:04,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:01:08,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:10,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:10,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=155453.33333333334, ans=0.0 2023-09-28 22:01:11,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:01:13,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:14,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:01:16,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:01:16,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:01:19,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:21,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:01:23,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=155520.0, ans=0.125 2023-09-28 22:01:24,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:01:24,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:01:28,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:33,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:01:34,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 22:01:36,989 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.377e+02 2.692e+02 3.102e+02 5.014e+02, threshold=5.385e+02, percent-clipped=0.0 2023-09-28 22:01:41,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:42,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:01:45,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:01:47,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 22:01:49,242 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 22:01:49,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:01:49,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:51,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:01:52,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:52,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 22:01:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 22:01:54,308 INFO [train.py:1039] (1/4) Epoch 5, batch 2100, loss[loss=0.2409, simple_loss=0.3075, pruned_loss=0.08719, over 24658.00 frames. ], tot_loss[loss=0.2501, simple_loss=0.3078, pruned_loss=0.09616, over 4704560.10 frames. ], batch size: 65, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:01:54,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:56,718 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.47 vs. limit=22.5 2023-09-28 22:01:57,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:01:59,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:02:01,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:01,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:02:01,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 22:02:04,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:02:04,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 22:02:04,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 22:02:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:05,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:07,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 22:02:07,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:02:15,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 22:02:15,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:02:17,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=155720.0, ans=0.2 2023-09-28 22:02:18,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:02:18,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:02:23,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:02:23,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 22:02:25,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:25,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 22:02:28,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 22:02:28,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:28,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 22:02:28,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 22:02:28,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 22:02:31,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:02:31,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=155786.66666666666, ans=0.09899494936611666 2023-09-28 22:02:33,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:02:33,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=155786.66666666666, ans=0.09899494936611666 2023-09-28 22:02:35,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:35,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:38,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:39,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:39,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 22:02:39,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:39,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:41,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:43,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 22:02:44,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 22:02:46,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 22:02:50,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:02:54,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:55,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 22:03:00,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:04,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:03:04,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:04,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:05,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 22:03:05,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:06,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=155920.0, ans=0.95 2023-09-28 22:03:07,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:07,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:03:09,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:03:10,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:12,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 22:03:13,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 22:03:15,125 INFO [train.py:1039] (1/4) Epoch 5, batch 2150, loss[loss=0.2489, simple_loss=0.3095, pruned_loss=0.09421, over 18334.00 frames. ], tot_loss[loss=0.2497, simple_loss=0.3072, pruned_loss=0.09614, over 4711906.93 frames. ], batch size: 39, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:03:15,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:18,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:03:18,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:03:18,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:03:18,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:03:25,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 22:03:27,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:28,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.13 vs. limit=15.0 2023-09-28 22:03:28,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:30,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:03:30,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:31,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:03:33,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=156053.33333333334, ans=0.0 2023-09-28 22:03:36,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:36,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:03:36,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:03:41,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:41,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 22:03:42,346 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.75 vs. limit=15.0 2023-09-28 22:03:48,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:48,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:03:49,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:49,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:03:51,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:51,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:51,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:53,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 22:03:54,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:03:55,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:56,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:57,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:58,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:04:01,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:01,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:04:03,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:03,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 22:04:03,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:04:06,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:07,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:09,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:11,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:04:12,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:12,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:12,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 22:04:14,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 22:04:14,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=156186.66666666666, ans=0.1 2023-09-28 22:04:15,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:04:15,857 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 22:04:15,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:17,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:04:19,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 22:04:19,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:04:19,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 22:04:19,572 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 22:04:19,572 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 22:04:19,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 22:04:21,000 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.209e+02 2.562e+02 3.022e+02 4.431e+02, threshold=5.124e+02, percent-clipped=0.0 2023-09-28 22:04:22,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:22,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:04:22,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:04:24,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:24,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:04:27,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:27,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:31,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=156253.33333333334, ans=0.125 2023-09-28 22:04:31,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=156253.33333333334, ans=0.125 2023-09-28 22:04:37,511 INFO [train.py:1039] (1/4) Epoch 5, batch 2200, loss[loss=0.2281, simple_loss=0.3026, pruned_loss=0.07678, over 24421.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3074, pruned_loss=0.09555, over 4731705.73 frames. ], batch size: 69, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:04:37,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:04:37,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 22:04:42,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:04:44,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-09-28 22:04:47,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:47,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=156320.0, ans=0.1 2023-09-28 22:04:49,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:04:49,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:50,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:04:54,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:54,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:54,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 22:05:00,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 22:05:01,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:05:08,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 22:05:10,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:11,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:12,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:05:16,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:05:18,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 22:05:22,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:05:22,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:22,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:05:25,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:05:27,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:28,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:05:30,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:32,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 22:05:34,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:36,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 22:05:37,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:37,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:05:37,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:38,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=156520.0, ans=0.2 2023-09-28 22:05:38,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=156520.0, ans=0.2 2023-09-28 22:05:39,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:40,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:40,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:40,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:42,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:05:42,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:05:46,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:05:49,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 22:05:50,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:05:54,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:05:54,818 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 22:05:55,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=156586.66666666666, ans=0.1 2023-09-28 22:05:57,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:05:57,818 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 22:05:58,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=156586.66666666666, ans=0.125 2023-09-28 22:05:59,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:05:59,471 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 22:06:00,778 INFO [train.py:1039] (1/4) Epoch 5, batch 2250, loss[loss=0.2498, simple_loss=0.3034, pruned_loss=0.09808, over 23862.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3076, pruned_loss=0.09534, over 4734962.57 frames. ], batch size: 179, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:06:02,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:02,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:06:03,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 22:06:07,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:06:09,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:10,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=156653.33333333334, ans=0.2 2023-09-28 22:06:14,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:06:16,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:06:17,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:19,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:19,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:21,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=156720.0, ans=0.125 2023-09-28 22:06:22,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 22:06:22,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:22,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:06:25,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 22:06:25,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:06:25,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:26,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=156720.0, ans=0.125 2023-09-28 22:06:28,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:32,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:34,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:06:34,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:06:35,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 22:06:37,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:41,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:06:44,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:47,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:47,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=156786.66666666666, ans=0.05 2023-09-28 22:06:48,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:50,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:52,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:53,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:06:58,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:07:00,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:07:05,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:07:05,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:07:06,667 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.161e+02 2.544e+02 3.130e+02 4.790e+02, threshold=5.087e+02, percent-clipped=0.0 2023-09-28 22:07:06,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:07:13,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:07:16,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:07:16,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 22:07:16,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:18,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:07:21,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 22:07:22,988 INFO [train.py:1039] (1/4) Epoch 5, batch 2300, loss[loss=0.2293, simple_loss=0.308, pruned_loss=0.07528, over 24652.00 frames. ], tot_loss[loss=0.2493, simple_loss=0.3077, pruned_loss=0.09547, over 4722234.56 frames. ], batch size: 68, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:07:24,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:07:24,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:25,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=156986.66666666666, ans=0.2 2023-09-28 22:07:31,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:31,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:07:35,366 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 22:07:38,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:40,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=157053.33333333334, ans=0.1 2023-09-28 22:07:44,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:07:44,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:07:44,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:07:44,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:44,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 22:07:47,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:07:50,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:07:50,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:07:55,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:07:58,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:07:59,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=12.0 2023-09-28 22:08:01,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:04,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=157120.0, ans=0.1 2023-09-28 22:08:06,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:08:06,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:08:10,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:08:13,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:08:17,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:08:18,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:08:18,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:08:18,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 22:08:23,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:08:23,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:24,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:24,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:08:26,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:26,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:08:26,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:08:28,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 22:08:28,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:08:28,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:29,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 22:08:34,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=157253.33333333334, ans=0.07 2023-09-28 22:08:35,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:08:39,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:08:45,638 INFO [train.py:1039] (1/4) Epoch 5, batch 2350, loss[loss=0.3502, simple_loss=0.3765, pruned_loss=0.162, over 19613.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3082, pruned_loss=0.0957, over 4727714.36 frames. ], batch size: 389, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:08:45,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:45,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:08:45,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:08:47,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=157320.0, ans=0.125 2023-09-28 22:08:49,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:08:49,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:08:49,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:08:49,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=157320.0, ans=0.125 2023-09-28 22:08:50,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 22:08:51,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=157320.0, ans=0.125 2023-09-28 22:08:57,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:08:57,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 22:09:02,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 22:09:05,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:09:08,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:08,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:09,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:09,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:10,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 22:09:12,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=157386.66666666666, ans=0.125 2023-09-28 22:09:15,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:09:20,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 22:09:21,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:23,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:09:24,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:09:27,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:09:29,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 22:09:31,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:09:33,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:33,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:33,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:09:37,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:09:39,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 22:09:40,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:09:43,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:43,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:09:44,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 22:09:44,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:09:49,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 22:09:49,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:09:51,237 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.164e+02 2.489e+02 2.830e+02 4.285e+02, threshold=4.978e+02, percent-clipped=0.0 2023-09-28 22:09:51,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=157586.66666666666, ans=0.0 2023-09-28 22:09:55,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 22:09:58,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 22:09:59,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:59,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:09:59,856 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 22:09:59,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 22:10:00,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=157586.66666666666, ans=0.2 2023-09-28 22:10:02,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 22:10:04,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:10:08,326 INFO [train.py:1039] (1/4) Epoch 5, batch 2400, loss[loss=0.2598, simple_loss=0.3034, pruned_loss=0.1081, over 23736.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.3085, pruned_loss=0.09591, over 4728329.93 frames. ], batch size: 212, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:10:10,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:10:13,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:10:15,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:10:15,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 22:10:15,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 22:10:24,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:10:24,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:10:26,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 22:10:29,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:10:30,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:30,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 22:10:36,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:38,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 22:10:43,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:10:48,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 22:10:50,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:10:52,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:53,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=157786.66666666666, ans=0.125 2023-09-28 22:10:56,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:10:56,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 22:10:58,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:11:06,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:08,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:08,738 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:11:11,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:11,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=157853.33333333334, ans=0.125 2023-09-28 22:11:13,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:11:13,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:11:13,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:11:13,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:14,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:14,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:11:14,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=157920.0, ans=0.04949747468305833 2023-09-28 22:11:19,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:11:19,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:11:19,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 22:11:21,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 22:11:21,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=157920.0, ans=0.125 2023-09-28 22:11:23,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:11:24,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:25,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 22:11:25,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 22:11:25,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 22:11:25,163 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 22:11:25,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=157920.0, ans=0.125 2023-09-28 22:11:26,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 22:11:26,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:11:28,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:28,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:32,405 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 22:11:32,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:33,728 INFO [train.py:1039] (1/4) Epoch 5, batch 2450, loss[loss=0.2205, simple_loss=0.2477, pruned_loss=0.0967, over 18958.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3064, pruned_loss=0.09495, over 4706857.66 frames. ], batch size: 388, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:11:33,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:11:37,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:11:37,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:40,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:40,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:41,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 22:11:45,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=157986.66666666666, ans=0.125 2023-09-28 22:11:48,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:48,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:51,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:11:51,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:11:51,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:11:51,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 22:11:58,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:59,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:11:59,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:12:03,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:12:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:06,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:07,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:12:07,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=158120.0, ans=0.125 2023-09-28 22:12:09,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=158120.0, ans=0.125 2023-09-28 22:12:09,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=158120.0, ans=0.125 2023-09-28 22:12:10,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 22:12:10,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:12:18,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:19,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:12:19,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:19,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:12:21,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:21,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:12:23,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 22:12:26,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:28,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:12:31,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:12:31,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:34,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-09-28 22:12:37,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:12:37,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 22:12:38,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:12:39,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=158253.33333333334, ans=0.09899494936611666 2023-09-28 22:12:39,970 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.190e+02 2.639e+02 3.152e+02 5.360e+02, threshold=5.279e+02, percent-clipped=2.0 2023-09-28 22:12:40,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:12:40,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 22:12:40,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:12:41,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:12:45,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:12:48,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:48,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:12:51,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 22:12:53,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:12:55,224 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.25 vs. limit=10.0 2023-09-28 22:12:55,803 INFO [train.py:1039] (1/4) Epoch 5, batch 2500, loss[loss=0.2497, simple_loss=0.3181, pruned_loss=0.09063, over 24637.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3066, pruned_loss=0.09542, over 4698758.49 frames. ], batch size: 68, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:13:01,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:11,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:13:11,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:13:13,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:13,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 22:13:20,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:13:21,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:21,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:13:21,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:13:23,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 22:13:23,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:25,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:25,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 22:13:25,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:26,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 22:13:26,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:31,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:13:31,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:34,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:13:36,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 22:13:36,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:13:40,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:43,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:48,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:51,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:13:56,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:13:58,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 22:13:58,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:58,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:01,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:14:01,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:14:01,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 22:14:01,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 22:14:01,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 22:14:06,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:09,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 22:14:09,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 22:14:11,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:14:11,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 22:14:13,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=158586.66666666666, ans=0.125 2023-09-28 22:14:16,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 22:14:20,164 INFO [train.py:1039] (1/4) Epoch 5, batch 2550, loss[loss=0.2192, simple_loss=0.2888, pruned_loss=0.07476, over 24303.00 frames. ], tot_loss[loss=0.2493, simple_loss=0.3075, pruned_loss=0.09554, over 4708741.66 frames. ], batch size: 61, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:14:20,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:21,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:14:23,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:14:25,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:26,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 22:14:26,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=158653.33333333334, ans=0.0 2023-09-28 22:14:28,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:14:29,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=158653.33333333334, ans=0.125 2023-09-28 22:14:29,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=158653.33333333334, ans=0.0 2023-09-28 22:14:31,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 22:14:32,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:14:34,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:37,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:14:37,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 22:14:37,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:14:38,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:14:38,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:43,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:14:43,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 22:14:43,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:43,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:43,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 22:14:46,101 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.18 vs. limit=8.0 2023-09-28 22:14:59,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:15:04,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:04,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:04,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:15:07,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:15:12,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:15:16,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:15:16,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:15:16,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:15:17,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:15:17,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:15:18,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=158853.33333333334, ans=0.0 2023-09-28 22:15:19,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-09-28 22:15:21,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:21,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:24,943 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.422e+02 2.826e+02 3.525e+02 6.917e+02, threshold=5.653e+02, percent-clipped=3.0 2023-09-28 22:15:29,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:15:30,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 22:15:30,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:15:30,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:31,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:15:33,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:15:34,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:39,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:15:42,643 INFO [train.py:1039] (1/4) Epoch 5, batch 2600, loss[loss=0.3728, simple_loss=0.3954, pruned_loss=0.1751, over 19467.00 frames. ], tot_loss[loss=0.25, simple_loss=0.3084, pruned_loss=0.09573, over 4717063.34 frames. ], batch size: 388, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:15:42,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:45,744 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 22:15:47,488 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 22:15:47,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:15:48,900 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 22:15:49,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 22:15:49,040 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 22:15:52,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:52,083 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 22:15:53,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 22:15:55,219 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 22:15:56,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:15:58,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 22:16:01,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 22:16:02,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:16:02,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 22:16:03,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.31 vs. limit=22.5 2023-09-28 22:16:05,237 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 22:16:05,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 22:16:15,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:15,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:16,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:16,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 22:16:19,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:16:23,975 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 22:16:24,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=159120.0, ans=0.125 2023-09-28 22:16:28,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:28,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:30,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 22:16:31,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:31,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:33,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 22:16:37,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:16:37,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:16:37,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=159186.66666666666, ans=0.0 2023-09-28 22:16:38,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,551 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 22:16:42,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:16:47,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:47,554 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:16:48,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:16:50,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 22:16:50,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:53,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:16:53,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:16:53,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=159253.33333333334, ans=0.125 2023-09-28 22:16:58,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-09-28 22:16:59,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 22:17:00,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:02,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:17:04,014 INFO [train.py:1039] (1/4) Epoch 5, batch 2650, loss[loss=0.2257, simple_loss=0.296, pruned_loss=0.07768, over 24477.00 frames. ], tot_loss[loss=0.2511, simple_loss=0.3094, pruned_loss=0.09637, over 4710719.53 frames. ], batch size: 66, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:17:04,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.51 vs. limit=12.0 2023-09-28 22:17:09,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 22:17:09,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:10,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:17:12,813 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 22:17:12,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:15,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:17,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:17:18,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:17:20,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.40 vs. limit=6.0 2023-09-28 22:17:21,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:17:22,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 22:17:22,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:17:22,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:17:25,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 22:17:28,182 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 22:17:31,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:32,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 22:17:32,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:17:32,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 22:17:32,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=159386.66666666666, ans=0.015 2023-09-28 22:17:38,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:38,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:17:38,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:39,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:41,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=159453.33333333334, ans=0.125 2023-09-28 22:17:42,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 22:17:42,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 22:17:47,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:17:50,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 22:17:52,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:52,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:52,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:17:53,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:53,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:55,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:58,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:17:59,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:59,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:18:00,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.75 vs. limit=6.0 2023-09-28 22:18:02,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:18:04,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:05,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:18:05,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:08,808 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.817e+02 2.225e+02 2.541e+02 3.251e+02 5.495e+02, threshold=5.083e+02, percent-clipped=0.0 2023-09-28 22:18:08,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:18:08,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:18:12,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:13,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:18:13,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:13,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 22:18:13,845 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:18:17,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:18:21,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:22,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:24,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:24,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:18:25,852 INFO [train.py:1039] (1/4) Epoch 5, batch 2700, loss[loss=0.3444, simple_loss=0.376, pruned_loss=0.1565, over 19774.00 frames. ], tot_loss[loss=0.2515, simple_loss=0.3099, pruned_loss=0.09657, over 4706920.86 frames. ], batch size: 389, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:18:25,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:28,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:18:28,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 22:18:28,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=159653.33333333334, ans=0.2 2023-09-28 22:18:31,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:18:32,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:18:35,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:18:35,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:35,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:18:38,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:38,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:18:38,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:18:38,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 22:18:40,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:18:41,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:18:43,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:18:43,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:46,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:18:49,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 22:18:49,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:18:54,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:18:55,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:01,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:19:01,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:19:02,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:19:02,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:19:04,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=159786.66666666666, ans=0.125 2023-09-28 22:19:06,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:07,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:07,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:19:07,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:19:10,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=159786.66666666666, ans=0.2 2023-09-28 22:19:10,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=159786.66666666666, ans=0.05 2023-09-28 22:19:10,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=159786.66666666666, ans=0.125 2023-09-28 22:19:12,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:12,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:19:20,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:19:20,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=159853.33333333334, ans=0.125 2023-09-28 22:19:21,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:19:23,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:19:23,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:25,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=159853.33333333334, ans=0.07 2023-09-28 22:19:30,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:30,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:31,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:33,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:35,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:35,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:19:37,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:19:37,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=159920.0, ans=0.0 2023-09-28 22:19:38,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:38,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:41,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 22:19:41,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:44,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:19:44,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 22:19:47,731 INFO [train.py:1039] (1/4) Epoch 5, batch 2750, loss[loss=0.2635, simple_loss=0.3274, pruned_loss=0.09977, over 23368.00 frames. ], tot_loss[loss=0.2513, simple_loss=0.3091, pruned_loss=0.09674, over 4695278.37 frames. ], batch size: 93, lr: 1.99e-02, grad_scale: 16.0 2023-09-28 22:19:47,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 22:19:47,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:54,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:19:54,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:57,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:19:59,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:00,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:00,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:20:02,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:20:03,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:03,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 22:20:03,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:20:03,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:20:11,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 22:20:12,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:20:12,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:14,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:14,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:20:15,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:20:17,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:20:17,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:19,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:23,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:20:25,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:20:25,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:20:26,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:28,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:20:34,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=160120.0, ans=0.035 2023-09-28 22:20:36,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:38,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:20:38,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:42,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:42,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:20:42,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:20:50,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:20:50,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:50,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 22:20:53,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=160186.66666666666, ans=0.125 2023-09-28 22:20:54,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:56,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 22:20:57,661 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.252e+02 2.534e+02 3.037e+02 4.293e+02, threshold=5.069e+02, percent-clipped=0.0 2023-09-28 22:20:59,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=160253.33333333334, ans=0.0 2023-09-28 22:21:02,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:21:04,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:21:04,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 22:21:04,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=160253.33333333334, ans=0.125 2023-09-28 22:21:05,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:21:08,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:21:08,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 22:21:08,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:21:12,434 INFO [train.py:1039] (1/4) Epoch 5, batch 2800, loss[loss=0.2756, simple_loss=0.3169, pruned_loss=0.1172, over 23813.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3077, pruned_loss=0.09556, over 4706871.56 frames. ], batch size: 212, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:21:12,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:21:12,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:15,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:21:15,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 22:21:15,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:16,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:18,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:18,636 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 22:21:18,637 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 22:21:21,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:23,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=160320.0, ans=0.0 2023-09-28 22:21:25,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:21:25,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:21:28,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:21:31,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 22:21:34,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:21:36,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 22:21:37,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:37,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:21:37,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:21:40,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:21:42,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:42,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:21:44,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:21:51,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=160453.33333333334, ans=0.0 2023-09-28 22:21:54,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:21:56,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:58,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:01,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:22:01,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:04,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:04,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 22:22:04,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=160520.0, ans=0.0 2023-09-28 22:22:05,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:06,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:06,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:22:10,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:10,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:13,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:15,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:22:15,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:15,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:22:17,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:22:17,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:22:17,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:22:19,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 22:22:19,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:21,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:22:21,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:25,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 22:22:25,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:25,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:22:26,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:22:28,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 22:22:33,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:33,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=160586.66666666666, ans=0.0 2023-09-28 22:22:34,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:22:34,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:22:35,972 INFO [train.py:1039] (1/4) Epoch 5, batch 2850, loss[loss=0.2602, simple_loss=0.3111, pruned_loss=0.1046, over 23762.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3066, pruned_loss=0.09524, over 4714727.94 frames. ], batch size: 212, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:22:37,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:37,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=160653.33333333334, ans=0.125 2023-09-28 22:22:42,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:22:42,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:22:42,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=160653.33333333334, ans=0.0 2023-09-28 22:22:43,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:45,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:45,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:47,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:22:47,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 22:22:53,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 22:22:53,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:55,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 22:22:56,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:59,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 22:23:01,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 22:23:03,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:15,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:16,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:18,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:23:18,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:23:18,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:23:19,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:23:21,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:23:22,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 22:23:25,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:23:25,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:23:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:27,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:30,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:30,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:33,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:34,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:37,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:23:38,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:40,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:41,933 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.140e+02 2.376e+02 2.803e+02 4.746e+02, threshold=4.753e+02, percent-clipped=0.0 2023-09-28 22:23:42,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:23:44,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=160920.0, ans=0.0 2023-09-28 22:23:46,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:23:48,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 22:23:48,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 22:23:49,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:23:50,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:50,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 22:23:51,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:23:51,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:51,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:53,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:23:53,189 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 22:23:53,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 22:23:53,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:23:54,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:56,231 INFO [train.py:1039] (1/4) Epoch 5, batch 2900, loss[loss=0.2538, simple_loss=0.3021, pruned_loss=0.1027, over 24399.00 frames. ], tot_loss[loss=0.2481, simple_loss=0.3066, pruned_loss=0.09477, over 4720398.44 frames. ], batch size: 58, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:23:58,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:23:58,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:58,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:00,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 22:24:06,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:06,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 22:24:08,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 22:24:09,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:24:09,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:24:12,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:12,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:24:15,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:24:15,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:17,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:24:18,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 22:24:19,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:24:20,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:20,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=161053.33333333334, ans=0.2 2023-09-28 22:24:23,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 22:24:24,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 22:24:27,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:24:27,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 22:24:28,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:24:30,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.09 vs. limit=22.5 2023-09-28 22:24:31,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:24:31,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:24:34,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:34,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:39,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:24:42,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:43,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 22:24:45,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 22:24:45,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:24:48,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:24:50,480 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.45 vs. limit=22.5 2023-09-28 22:24:51,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 22:24:51,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:24:56,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=161186.66666666666, ans=0.2 2023-09-28 22:24:57,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:25:07,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:25:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:25:09,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 22:25:14,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:14,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 22:25:15,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:15,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:25:20,022 INFO [train.py:1039] (1/4) Epoch 5, batch 2950, loss[loss=0.2032, simple_loss=0.2683, pruned_loss=0.06899, over 24330.00 frames. ], tot_loss[loss=0.2488, simple_loss=0.3078, pruned_loss=0.09486, over 4720672.11 frames. ], batch size: 56, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:25:21,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:23,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 22:25:25,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:25,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:25,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=161320.0, ans=0.0 2023-09-28 22:25:26,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:25:28,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:25:29,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 22:25:29,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 22:25:31,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:25:31,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:38,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:25:40,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:25:45,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:25:45,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:25:49,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:25:49,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:25:49,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:25:56,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 22:25:57,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 22:25:59,162 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 22:25:59,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:26:02,251 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 22:26:03,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 22:26:03,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:26:03,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:26:03,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 22:26:05,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:26:07,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 22:26:08,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:26:09,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:26:10,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.12 vs. limit=12.0 2023-09-28 22:26:11,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:14,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:26:14,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:14,600 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 22:26:14,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:14,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 22:26:21,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:23,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:26:24,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 22:26:24,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:26:25,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 22:26:27,639 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.929e+02 2.317e+02 2.702e+02 3.273e+02 4.611e+02, threshold=5.405e+02, percent-clipped=0.0 2023-09-28 22:26:29,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:31,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-09-28 22:26:32,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:26:32,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:26:34,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:34,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:26:34,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:26:35,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:35,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:26:37,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:26:38,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:39,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:26:40,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:40,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 22:26:41,948 INFO [train.py:1039] (1/4) Epoch 5, batch 3000, loss[loss=0.2784, simple_loss=0.3217, pruned_loss=0.1175, over 23869.00 frames. ], tot_loss[loss=0.2499, simple_loss=0.3085, pruned_loss=0.0956, over 4715184.80 frames. ], batch size: 195, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:26:41,949 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 22:26:57,276 INFO [train.py:1071] (1/4) Epoch 5, validation: loss=0.3788, simple_loss=0.3301, pruned_loss=0.2137, over 1125622.00 frames. 2023-09-28 22:26:57,277 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 22:26:57,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:59,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:00,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:27:04,970 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 22:27:05,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 22:27:07,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:27:07,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:27:08,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 22:27:08,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:11,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=161720.0, ans=0.0 2023-09-28 22:27:12,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:27:16,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=161720.0, ans=0.1 2023-09-28 22:27:22,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:27:28,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=161786.66666666666, ans=0.0 2023-09-28 22:27:30,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 22:27:30,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=161786.66666666666, ans=0.1 2023-09-28 22:27:32,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:27:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:27:35,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:37,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:27:39,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:39,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 22:27:42,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 22:27:43,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:27:45,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:27:46,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:27:46,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:48,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:48,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:27:52,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:27:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:27:55,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:56,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=161853.33333333334, ans=0.1 2023-09-28 22:27:57,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 22:27:59,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:28:00,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:01,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:28:04,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:04,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:06,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:28:08,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 22:28:08,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:08,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 22:28:09,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:28:11,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 22:28:14,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:17,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:28:17,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 22:28:19,143 INFO [train.py:1039] (1/4) Epoch 5, batch 3050, loss[loss=0.2654, simple_loss=0.3183, pruned_loss=0.1063, over 23605.00 frames. ], tot_loss[loss=0.2509, simple_loss=0.3095, pruned_loss=0.09618, over 4708710.42 frames. ], batch size: 149, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:28:19,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 22:28:19,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:28:20,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:28:21,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:21,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:28:22,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:22,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:28:25,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 22:28:27,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:28:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:30,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:28:33,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:35,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=162053.33333333334, ans=0.125 2023-09-28 22:28:38,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 22:28:45,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 22:28:45,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 22:28:45,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:28:46,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=162053.33333333334, ans=0.025 2023-09-28 22:28:51,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:28:52,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:52,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:54,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:28:57,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:28:57,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:57,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:58,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:58,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:00,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:02,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:03,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:29:04,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=162120.0, ans=0.125 2023-09-28 22:29:05,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 22:29:07,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:07,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:29:10,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:29:11,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:29:11,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:11,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:18,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:18,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:25,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:25,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:29:25,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:29:26,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.10 vs. limit=22.5 2023-09-28 22:29:27,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:27,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:29:28,682 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.152e+02 2.476e+02 2.829e+02 3.891e+02, threshold=4.952e+02, percent-clipped=0.0 2023-09-28 22:29:28,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:29:29,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=162253.33333333334, ans=0.125 2023-09-28 22:29:30,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 22:29:31,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:31,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:33,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 22:29:34,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:40,978 INFO [train.py:1039] (1/4) Epoch 5, batch 3100, loss[loss=0.241, simple_loss=0.2942, pruned_loss=0.09386, over 23787.00 frames. ], tot_loss[loss=0.2507, simple_loss=0.3093, pruned_loss=0.09604, over 4717123.17 frames. ], batch size: 179, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:29:42,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:44,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:29:47,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:29:49,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 22:29:51,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 22:29:53,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 22:29:53,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:29:58,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:58,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:00,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:30:03,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:07,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 22:30:12,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:30:13,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:14,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:14,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:30:14,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:30:18,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:30:18,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 22:30:18,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:30:20,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:20,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 22:30:22,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:30:25,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:30:27,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 22:30:27,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 22:30:29,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=162453.33333333334, ans=0.125 2023-09-28 22:30:30,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:31,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:34,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:34,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:35,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:30:37,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:30:37,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:30:38,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:30:38,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:30:38,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:38,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 22:30:40,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=162520.0, ans=0.0 2023-09-28 22:30:44,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:46,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 22:30:49,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:30:49,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 22:30:51,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:51,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:53,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 22:31:04,555 INFO [train.py:1039] (1/4) Epoch 5, batch 3150, loss[loss=0.2721, simple_loss=0.3224, pruned_loss=0.1109, over 23289.00 frames. ], tot_loss[loss=0.2497, simple_loss=0.3085, pruned_loss=0.09544, over 4711177.82 frames. ], batch size: 105, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:31:05,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 22:31:07,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=162653.33333333334, ans=0.125 2023-09-28 22:31:08,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:08,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:10,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:31:10,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:31:10,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 22:31:11,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:11,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:31:13,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 22:31:15,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:16,809 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 22:31:18,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 22:31:18,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:31:18,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=162653.33333333334, ans=0.125 2023-09-28 22:31:19,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.83 vs. limit=6.0 2023-09-28 22:31:20,070 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 22:31:21,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:31:24,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 22:31:25,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 22:31:25,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 22:31:25,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:25,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:26,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:30,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 22:31:30,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=162720.0, ans=0.125 2023-09-28 22:31:31,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:31,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:33,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:36,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:31:40,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 22:31:42,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:31:43,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:31:45,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:45,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 22:31:48,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 22:31:50,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:31:50,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:31:51,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:31:51,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:51,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:31:53,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:31:53,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:31:54,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 22:31:54,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:31:54,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:31:57,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:31:57,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:57,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 22:31:59,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:01,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 22:32:01,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:03,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 22:32:05,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 22:32:06,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:32:08,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:09,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 22:32:09,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:32:11,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:32:11,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=162920.0, ans=0.0 2023-09-28 22:32:13,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:32:15,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.257e+02 2.534e+02 2.930e+02 4.234e+02, threshold=5.067e+02, percent-clipped=0.0 2023-09-28 22:32:15,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:17,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:32:21,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:32:21,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:24,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 22:32:27,685 INFO [train.py:1039] (1/4) Epoch 5, batch 3200, loss[loss=0.251, simple_loss=0.2929, pruned_loss=0.1046, over 22751.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3068, pruned_loss=0.09484, over 4709860.87 frames. ], batch size: 322, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:32:30,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:32:30,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:32:32,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=162986.66666666666, ans=0.0 2023-09-28 22:32:34,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:36,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:32:36,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 22:32:39,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:44,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:32:48,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:58,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:33:07,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 22:33:07,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:33:11,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 22:33:13,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:33:13,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=163120.0, ans=0.05 2023-09-28 22:33:15,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=163120.0, ans=0.125 2023-09-28 22:33:16,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:33:16,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:33:17,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:33:19,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.58 vs. limit=22.5 2023-09-28 22:33:22,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 22:33:24,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:33:24,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=163186.66666666666, ans=0.0 2023-09-28 22:33:26,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 22:33:27,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=163186.66666666666, ans=0.125 2023-09-28 22:33:29,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=163186.66666666666, ans=0.0 2023-09-28 22:33:30,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 22:33:33,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:33:35,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=163253.33333333334, ans=0.0 2023-09-28 22:33:38,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=163253.33333333334, ans=0.125 2023-09-28 22:33:39,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:39,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:33:39,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:40,013 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 22:33:40,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:33:45,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:33:47,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 22:33:48,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 22:33:48,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 22:33:49,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 22:33:50,374 INFO [train.py:1039] (1/4) Epoch 5, batch 3250, loss[loss=0.26, simple_loss=0.3294, pruned_loss=0.09532, over 24670.00 frames. ], tot_loss[loss=0.2469, simple_loss=0.306, pruned_loss=0.09386, over 4729791.91 frames. ], batch size: 73, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:33:52,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:33:53,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:33:53,759 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 22:33:55,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:33:55,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:33:57,468 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 22:33:59,865 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.48 vs. limit=15.0 2023-09-28 22:34:02,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:34:05,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:12,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=15.0 2023-09-28 22:34:13,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:34:13,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 22:34:13,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:14,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:34:14,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:16,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:17,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:34:18,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.93 vs. limit=15.0 2023-09-28 22:34:20,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:20,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:34:20,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:22,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:34:24,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.23 vs. limit=15.0 2023-09-28 22:34:25,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:26,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:27,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.38 vs. limit=6.0 2023-09-28 22:34:28,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:29,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:30,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:32,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:32,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:34:40,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 22:34:40,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:40,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:34:41,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:43,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:34:48,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:34:52,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=163520.0, ans=0.125 2023-09-28 22:34:54,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:34:54,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:54,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 22:34:54,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:34:54,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:34:57,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:59,438 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=6.10 vs. limit=15.0 2023-09-28 22:34:59,765 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.181e+02 2.539e+02 2.910e+02 4.275e+02, threshold=5.078e+02, percent-clipped=0.0 2023-09-28 22:34:59,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 22:35:00,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 22:35:00,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:35:01,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:01,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:03,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:35:03,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:05,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=163586.66666666666, ans=0.2 2023-09-28 22:35:08,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:08,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:11,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 22:35:11,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:13,131 INFO [train.py:1039] (1/4) Epoch 5, batch 3300, loss[loss=0.2505, simple_loss=0.298, pruned_loss=0.1015, over 23592.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.3069, pruned_loss=0.09434, over 4715230.03 frames. ], batch size: 256, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:35:13,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:35:13,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 22:35:16,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:35:16,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 22:35:19,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 22:35:19,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 22:35:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:22,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:24,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:35:24,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:27,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:35:27,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:35:31,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:33,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:33,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=163720.0, ans=0.125 2023-09-28 22:35:36,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 22:35:37,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:35:37,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:38,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=163720.0, ans=0.0 2023-09-28 22:35:40,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:40,240 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 22:35:41,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:35:41,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:35:43,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:35:43,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:35:43,460 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 22:35:45,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=163786.66666666666, ans=0.1 2023-09-28 22:35:47,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:47,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:35:50,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:50,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 22:35:50,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 22:35:51,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:51,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:35:55,080 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 22:35:56,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 22:35:58,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:00,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 22:36:01,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:03,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:36:05,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:07,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.87 vs. limit=22.5 2023-09-28 22:36:07,859 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.98 vs. limit=12.0 2023-09-28 22:36:08,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:08,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:08,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:36:09,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:36:11,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:36:11,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:12,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=163853.33333333334, ans=0.0 2023-09-28 22:36:13,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:36:15,312 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 22:36:15,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 22:36:17,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:36:18,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:36:18,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:19,359 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.01 vs. limit=22.5 2023-09-28 22:36:20,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:20,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:22,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:36:23,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:23,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:36:23,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:26,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:36:28,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 22:36:28,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:30,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:33,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:36:33,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:35,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:36,541 INFO [train.py:1039] (1/4) Epoch 5, batch 3350, loss[loss=0.3396, simple_loss=0.3616, pruned_loss=0.1588, over 19005.00 frames. ], tot_loss[loss=0.2489, simple_loss=0.308, pruned_loss=0.09492, over 4713404.06 frames. ], batch size: 388, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:36:36,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:36,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:38,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=163986.66666666666, ans=0.125 2023-09-28 22:36:41,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:43,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:44,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:47,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:50,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:36:51,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:51,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:36:51,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=164053.33333333334, ans=0.0 2023-09-28 22:36:53,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=164053.33333333334, ans=0.1 2023-09-28 22:36:54,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 22:36:58,141 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 22:36:58,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:37:01,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 22:37:01,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 22:37:01,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:37:01,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:37:03,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:04,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 22:37:04,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:04,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:37:06,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:08,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:37:08,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=164120.0, ans=0.1 2023-09-28 22:37:11,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=164120.0, ans=0.125 2023-09-28 22:37:14,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:17,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:17,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:20,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=164120.0, ans=0.0 2023-09-28 22:37:22,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:37:22,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:24,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:24,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:27,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=164186.66666666666, ans=0.125 2023-09-28 22:37:28,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:30,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 22:37:31,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:37:31,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 22:37:32,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:37:34,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 22:37:34,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:35,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:42,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:42,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 22:37:44,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:37:46,088 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.271e+02 2.616e+02 3.188e+02 4.875e+02, threshold=5.232e+02, percent-clipped=0.0 2023-09-28 22:37:46,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:37:47,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:37:50,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:37:53,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 22:37:54,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:37:55,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:37:56,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:57,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 22:37:58,969 INFO [train.py:1039] (1/4) Epoch 5, batch 3400, loss[loss=0.3539, simple_loss=0.3742, pruned_loss=0.1668, over 19253.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3084, pruned_loss=0.09518, over 4716008.79 frames. ], batch size: 388, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:37:59,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:59,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 22:38:02,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:38:03,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:38:05,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 22:38:10,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 22:38:10,452 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 22:38:10,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:12,387 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:38:15,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:38:15,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:38:15,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:16,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:38:19,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=164386.66666666666, ans=0.125 2023-09-28 22:38:22,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:22,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 22:38:27,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:38:31,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:32,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:32,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:38:35,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=164453.33333333334, ans=0.125 2023-09-28 22:38:38,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:38:43,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 22:38:49,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 22:38:51,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:38:53,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:53,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:53,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:38:58,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:39:01,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:39:01,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:39:07,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:10,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 22:39:12,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=164586.66666666666, ans=0.0 2023-09-28 22:39:16,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:39:21,481 INFO [train.py:1039] (1/4) Epoch 5, batch 3450, loss[loss=0.2635, simple_loss=0.3086, pruned_loss=0.1092, over 23892.00 frames. ], tot_loss[loss=0.249, simple_loss=0.3082, pruned_loss=0.0949, over 4721991.92 frames. ], batch size: 195, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:39:21,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 22:39:25,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 22:39:27,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:39:28,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:39:28,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 22:39:31,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:31,733 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.42 vs. limit=15.0 2023-09-28 22:39:36,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:39:39,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:39:39,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:39:41,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:39:41,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:43,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:48,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 22:39:48,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=164720.0, ans=0.0 2023-09-28 22:39:54,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 22:39:54,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:39:54,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:39:57,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:02,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 22:40:04,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:40:08,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=164786.66666666666, ans=0.05 2023-09-28 22:40:09,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:09,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:40:11,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:40:13,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:40:15,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 22:40:15,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:16,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:40:18,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:40:21,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 22:40:25,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:40:27,385 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.08 vs. limit=15.0 2023-09-28 22:40:29,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:40:31,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:33,104 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.242e+02 2.572e+02 2.948e+02 4.937e+02, threshold=5.144e+02, percent-clipped=0.0 2023-09-28 22:40:33,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:39,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:40,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:40,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:40:40,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:44,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=164986.66666666666, ans=0.1 2023-09-28 22:40:45,219 INFO [train.py:1039] (1/4) Epoch 5, batch 3500, loss[loss=0.2341, simple_loss=0.3031, pruned_loss=0.0826, over 24684.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.307, pruned_loss=0.09483, over 4710772.11 frames. ], batch size: 73, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:40:46,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:50,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:40:50,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 22:40:52,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:40:52,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=164986.66666666666, ans=0.2 2023-09-28 22:40:56,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:41:00,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:41:01,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 22:41:06,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:41:07,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:41:09,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:41:09,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:09,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:41:11,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:11,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:13,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 22:41:16,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:16,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:41:19,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:22,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:22,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 22:41:22,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:23,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=165120.0, ans=0.0 2023-09-28 22:41:23,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=165120.0, ans=0.125 2023-09-28 22:41:25,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=165120.0, ans=0.125 2023-09-28 22:41:26,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:29,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:41:29,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:41:31,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:32,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 22:41:34,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 22:41:34,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 22:41:35,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:37,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:39,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:39,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:41:44,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:41:44,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:41:51,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:41:53,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 22:41:53,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 22:41:53,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:41:53,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=165253.33333333334, ans=0.125 2023-09-28 22:41:56,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:41:56,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:41:56,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:00,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 22:42:01,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:42:03,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:42:04,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 22:42:06,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 22:42:07,719 INFO [train.py:1039] (1/4) Epoch 5, batch 3550, loss[loss=0.2467, simple_loss=0.271, pruned_loss=0.1112, over 18999.00 frames. ], tot_loss[loss=0.2469, simple_loss=0.3057, pruned_loss=0.09406, over 4718011.74 frames. ], batch size: 388, lr: 1.96e-02, grad_scale: 16.0 2023-09-28 22:42:07,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:09,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:42:09,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:11,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:14,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:42:15,109 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:42:16,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=165320.0, ans=0.0 2023-09-28 22:42:21,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=165320.0, ans=0.125 2023-09-28 22:42:25,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:26,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:42:28,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:30,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:42:32,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:33,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:42:33,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:42:36,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:36,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:42:37,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:37,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:42:38,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:42:43,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:42:44,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:46,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:42:46,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:46,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:42:46,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 22:42:46,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:50,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:52,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:42:57,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:57,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:59,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:01,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=165520.0, ans=0.125 2023-09-28 22:43:02,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 22:43:02,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:43:04,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 22:43:04,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:43:07,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:43:08,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:43:11,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 22:43:12,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:13,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=165520.0, ans=0.125 2023-09-28 22:43:17,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:17,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 22:43:19,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:20,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.191e+02 2.567e+02 2.914e+02 4.741e+02, threshold=5.134e+02, percent-clipped=0.0 2023-09-28 22:43:24,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:43:28,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 22:43:33,471 INFO [train.py:1039] (1/4) Epoch 5, batch 3600, loss[loss=0.2238, simple_loss=0.3014, pruned_loss=0.07312, over 24493.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.3044, pruned_loss=0.09258, over 4725219.31 frames. ], batch size: 66, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:43:35,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 22:43:35,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:43:36,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:43:36,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:43:43,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:43:45,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:46,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:43:46,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:43:48,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:48,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 22:43:51,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:43:54,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:56,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:43:56,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=165720.0, ans=0.0 2023-09-28 22:43:59,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:01,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:44:01,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:44:02,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=165720.0, ans=0.125 2023-09-28 22:44:03,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 22:44:04,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:44:08,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:44:08,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:44:11,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:11,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=165786.66666666666, ans=0.1 2023-09-28 22:44:14,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:14,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:14,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 22:44:22,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:24,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:44:25,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 22:44:30,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:44:35,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:37,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:41,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.77 vs. limit=15.0 2023-09-28 22:44:45,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:44:45,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:44:45,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 22:44:46,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 22:44:47,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 22:44:50,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:50,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:44:52,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 22:44:52,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:44:53,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:44:53,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:55,043 INFO [train.py:1039] (1/4) Epoch 5, batch 3650, loss[loss=0.2257, simple_loss=0.3029, pruned_loss=0.0742, over 24670.00 frames. ], tot_loss[loss=0.246, simple_loss=0.3054, pruned_loss=0.09332, over 4723544.11 frames. ], batch size: 68, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:44:55,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 22:44:55,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 22:44:58,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:59,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 22:45:03,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=165986.66666666666, ans=0.5 2023-09-28 22:45:04,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 22:45:05,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:45:09,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 22:45:10,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 22:45:15,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:15,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:45:15,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:45:15,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=166053.33333333334, ans=0.035 2023-09-28 22:45:18,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:45:20,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:45:20,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 22:45:21,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:45:21,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:23,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 22:45:24,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=166053.33333333334, ans=0.0 2023-09-28 22:45:25,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:45:25,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:45:25,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:28,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:45:30,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 22:45:33,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 22:45:33,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:45:34,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 22:45:36,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:36,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:45:42,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:45:43,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:44,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:45:46,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:45:46,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:45:50,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:45:50,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=166186.66666666666, ans=0.125 2023-09-28 22:45:53,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:54,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:55,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:56,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:45:56,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:58,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:04,136 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.19 vs. limit=15.0 2023-09-28 22:46:06,469 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.312e+02 2.641e+02 2.987e+02 4.263e+02, threshold=5.283e+02, percent-clipped=0.0 2023-09-28 22:46:06,585 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 22:46:11,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:11,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:12,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:46:12,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:12,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:46:13,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=166253.33333333334, ans=0.2 2023-09-28 22:46:14,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:17,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 22:46:17,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:18,614 INFO [train.py:1039] (1/4) Epoch 5, batch 3700, loss[loss=0.2358, simple_loss=0.288, pruned_loss=0.09186, over 24332.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.306, pruned_loss=0.09445, over 4723837.63 frames. ], batch size: 56, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:46:18,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:46:21,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:46:21,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:46:26,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:26,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 22:46:26,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:27,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:46:28,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:46:32,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:46:35,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:35,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:37,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:46:37,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:38,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:46:40,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:41,974 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 22:46:43,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=166386.66666666666, ans=0.125 2023-09-28 22:46:45,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=166386.66666666666, ans=0.125 2023-09-28 22:46:50,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:46:51,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:46:51,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:46:53,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 22:46:53,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:46:58,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:58,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 22:47:00,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:01,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:47:03,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:04,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:47:05,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=166453.33333333334, ans=0.125 2023-09-28 22:47:08,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:47:12,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:47:13,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 22:47:14,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:47:14,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 22:47:16,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=166520.0, ans=0.125 2023-09-28 22:47:19,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:47:19,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:47:20,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.65 vs. limit=10.0 2023-09-28 22:47:22,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:23,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 22:47:26,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:47:26,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:47:26,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:26,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:31,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:32,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 22:47:33,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 22:47:33,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:47:33,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:37,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:47:37,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:47:40,436 INFO [train.py:1039] (1/4) Epoch 5, batch 3750, loss[loss=0.2443, simple_loss=0.3132, pruned_loss=0.08765, over 24444.00 frames. ], tot_loss[loss=0.2485, simple_loss=0.3077, pruned_loss=0.09468, over 4726493.86 frames. ], batch size: 69, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:47:40,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:42,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:47:45,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:47:47,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 22:47:47,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:47:50,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:47:50,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 22:47:52,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:47:53,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.91 vs. limit=22.5 2023-09-28 22:47:53,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:47:58,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:02,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:48:03,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:48:05,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:48:10,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:11,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 22:48:12,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:15,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:16,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:19,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=166786.66666666666, ans=0.2 2023-09-28 22:48:20,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 22:48:20,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=166786.66666666666, ans=0.125 2023-09-28 22:48:22,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 22:48:23,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:25,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:25,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:25,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=166786.66666666666, ans=0.2 2023-09-28 22:48:31,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:31,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:48:37,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 22:48:40,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:41,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:48:41,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:48:46,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:48:49,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:48:51,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:48:52,778 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.387e+02 2.666e+02 3.325e+02 5.060e+02, threshold=5.333e+02, percent-clipped=0.0 2023-09-28 22:48:52,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:48:54,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:48:58,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:49:04,277 INFO [train.py:1039] (1/4) Epoch 5, batch 3800, loss[loss=0.2617, simple_loss=0.2899, pruned_loss=0.1167, over 22661.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3081, pruned_loss=0.09538, over 4724914.49 frames. ], batch size: 322, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:49:07,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:49:11,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:13,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:49:13,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 22:49:14,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:16,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:18,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:49:22,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 22:49:22,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:22,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:49:22,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=167053.33333333334, ans=0.125 2023-09-28 22:49:25,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:27,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:49:27,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:27,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 22:49:31,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:49:32,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:49:33,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:35,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:49:37,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:49:38,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:49:38,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:41,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:41,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:47,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:49:47,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 22:49:50,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:49:58,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:04,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:50:06,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 22:50:08,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=167253.33333333334, ans=0.1 2023-09-28 22:50:10,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 22:50:10,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:13,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:50:13,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:14,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 22:50:19,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 22:50:19,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 22:50:20,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:20,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:25,470 INFO [train.py:1039] (1/4) Epoch 5, batch 3850, loss[loss=0.2813, simple_loss=0.337, pruned_loss=0.1128, over 23754.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.307, pruned_loss=0.09509, over 4707988.37 frames. ], batch size: 85, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:50:25,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:50:25,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:50:31,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:50:32,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 22:50:34,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:50:34,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:37,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:50:39,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:41,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:50:43,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 22:50:44,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=167386.66666666666, ans=0.125 2023-09-28 22:50:49,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:52,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:54,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:50:54,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:50:54,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=167386.66666666666, ans=0.1 2023-09-28 22:50:59,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:59,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:51:01,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:01,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:51:01,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:02,239 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=12.0 2023-09-28 22:51:04,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:06,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:06,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:51:07,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 22:51:07,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 22:51:09,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:09,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:10,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.82 vs. limit=15.0 2023-09-28 22:51:11,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:12,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:12,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 22:51:17,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 22:51:19,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:20,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 22:51:22,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:51:25,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=167520.0, ans=0.1 2023-09-28 22:51:28,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:30,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:35,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:35,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 22:51:37,562 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.127e+02 2.578e+02 3.001e+02 5.626e+02, threshold=5.156e+02, percent-clipped=1.0 2023-09-28 22:51:37,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 22:51:39,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=167586.66666666666, ans=0.0 2023-09-28 22:51:40,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:40,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:45,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:51:45,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:51:45,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:51:46,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 22:51:48,211 INFO [train.py:1039] (1/4) Epoch 5, batch 3900, loss[loss=0.2325, simple_loss=0.2827, pruned_loss=0.09113, over 23771.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3058, pruned_loss=0.09468, over 4708770.87 frames. ], batch size: 150, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:51:48,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:48,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 22:51:50,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:50,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:52,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:51:52,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:53,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:51:54,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=167653.33333333334, ans=0.1 2023-09-28 22:51:55,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:55,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:56,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:51:56,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 22:51:56,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:59,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:01,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:01,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:52:02,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-09-28 22:52:02,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:04,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:04,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:06,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=167720.0, ans=0.0 2023-09-28 22:52:08,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:52:08,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=167720.0, ans=0.0 2023-09-28 22:52:09,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 22:52:09,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:11,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 22:52:13,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:13,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 22:52:15,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=167720.0, ans=0.0 2023-09-28 22:52:16,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 22:52:21,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:21,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:52:21,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:52:22,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:52:26,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:28,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=167786.66666666666, ans=0.04949747468305833 2023-09-28 22:52:29,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:52:31,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:52:31,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:52:32,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:52:38,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:38,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:52:39,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=167853.33333333334, ans=0.0 2023-09-28 22:52:47,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:52:49,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:52:59,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:02,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:02,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 22:53:02,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 22:53:02,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:05,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 22:53:06,871 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.88 vs. limit=10.0 2023-09-28 22:53:07,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:53:08,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 22:53:10,447 INFO [train.py:1039] (1/4) Epoch 5, batch 3950, loss[loss=0.2467, simple_loss=0.3134, pruned_loss=0.09004, over 24402.00 frames. ], tot_loss[loss=0.2462, simple_loss=0.3046, pruned_loss=0.0939, over 4698185.99 frames. ], batch size: 77, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:53:16,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:53:17,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 22:53:17,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:53:19,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:53:21,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:53:27,993 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 22:53:28,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:28,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 22:53:28,223 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 22:53:29,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:32,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:34,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:53:34,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:36,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 22:53:39,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:53:39,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:39,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:53:41,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:53:42,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:53:55,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:53:56,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:54:00,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 22:54:07,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 22:54:07,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 22:54:08,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:54:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:54:10,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=168186.66666666666, ans=0.2 2023-09-28 22:54:16,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=168253.33333333334, ans=0.0 2023-09-28 22:54:17,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:54:17,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:54:19,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:54:20,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:54:20,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 22:54:20,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=168253.33333333334, ans=0.125 2023-09-28 22:54:21,767 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.253e+02 2.651e+02 3.133e+02 5.052e+02, threshold=5.303e+02, percent-clipped=0.0 2023-09-28 22:54:24,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:54:25,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:54:30,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 22:54:33,566 INFO [train.py:1039] (1/4) Epoch 5, batch 4000, loss[loss=0.2373, simple_loss=0.3045, pruned_loss=0.08503, over 24623.00 frames. ], tot_loss[loss=0.2473, simple_loss=0.3061, pruned_loss=0.09426, over 4705826.93 frames. ], batch size: 65, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:54:40,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:47,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:51,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:54:53,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:54:53,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=168386.66666666666, ans=0.0 2023-09-28 22:54:54,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:54,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 22:54:54,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:54:56,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 22:54:56,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:54:56,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 22:54:58,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:01,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:55:03,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:03,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:55:03,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:03,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:55:04,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:55:06,473 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 22:55:07,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:55:09,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:13,072 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 22:55:13,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:55:13,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:13,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=168453.33333333334, ans=0.04949747468305833 2023-09-28 22:55:22,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 22:55:22,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:24,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=168520.0, ans=0.0 2023-09-28 22:55:25,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:55:25,848 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 22:55:27,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:55:27,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 22:55:27,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:55:28,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:30,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:55:32,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:55:32,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:55:32,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:34,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 22:55:34,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:35,992 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 22:55:38,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=168586.66666666666, ans=0.0 2023-09-28 22:55:41,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:55:44,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:55:49,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:55:49,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:49,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:55:51,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:56,164 INFO [train.py:1039] (1/4) Epoch 5, batch 4050, loss[loss=0.2469, simple_loss=0.3173, pruned_loss=0.08826, over 24380.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.3068, pruned_loss=0.09436, over 4720360.18 frames. ], batch size: 77, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:55:56,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:57,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:55:59,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 22:55:59,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:56:01,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:02,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:56:02,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=168653.33333333334, ans=0.015 2023-09-28 22:56:04,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:06,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:09,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:09,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=168653.33333333334, ans=0.125 2023-09-28 22:56:13,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:13,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:56:15,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=168720.0, ans=0.125 2023-09-28 22:56:16,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:56:16,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:56:20,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:21,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:24,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 22:56:25,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=168720.0, ans=0.125 2023-09-28 22:56:27,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 22:56:27,426 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 22:56:29,318 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.58 vs. limit=15.0 2023-09-28 22:56:30,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:56:31,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.48 vs. limit=22.5 2023-09-28 22:56:36,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 22:56:38,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:56:41,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:42,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.44 vs. limit=15.0 2023-09-28 22:56:44,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:46,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:56:46,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:50,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:53,866 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.22 vs. limit=15.0 2023-09-28 22:56:54,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 22:56:54,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:56:55,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=168853.33333333334, ans=0.125 2023-09-28 22:56:56,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:56:56,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 22:57:02,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:57:08,225 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.148e+02 2.565e+02 3.123e+02 5.245e+02, threshold=5.130e+02, percent-clipped=0.0 2023-09-28 22:57:08,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 22:57:08,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:08,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:57:12,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 22:57:12,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 22:57:12,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:15,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:16,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:16,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:57:20,106 INFO [train.py:1039] (1/4) Epoch 5, batch 4100, loss[loss=0.2207, simple_loss=0.2831, pruned_loss=0.0791, over 24316.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3082, pruned_loss=0.09512, over 4711768.33 frames. ], batch size: 56, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:57:21,108 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.87 vs. limit=12.0 2023-09-28 22:57:23,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 22:57:24,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 22:57:25,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=168986.66666666666, ans=0.035 2023-09-28 22:57:27,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 22:57:27,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 22:57:29,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:29,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:29,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:29,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:57:31,558 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 22:57:35,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:35,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:57:35,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:37,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:57:40,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:57:41,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:41,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:57:41,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 22:57:43,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:43,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:57:43,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:45,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:57:45,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 22:57:48,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:57:51,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 22:57:52,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=169120.0, ans=0.125 2023-09-28 22:57:53,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:55,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:55,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 22:57:57,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:58,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:57:58,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:57:58,935 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.29 vs. limit=15.0 2023-09-28 22:58:01,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 22:58:01,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=169120.0, ans=0.125 2023-09-28 22:58:02,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:58:04,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:58:06,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=169120.0, ans=0.125 2023-09-28 22:58:07,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 22:58:08,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:58:09,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:11,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:15,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=169186.66666666666, ans=0.125 2023-09-28 22:58:17,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:20,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:22,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:58:31,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:58:31,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:34,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:37,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:58:39,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=169253.33333333334, ans=0.125 2023-09-28 22:58:42,152 INFO [train.py:1039] (1/4) Epoch 5, batch 4150, loss[loss=0.2712, simple_loss=0.3152, pruned_loss=0.1137, over 23827.00 frames. ], tot_loss[loss=0.2491, simple_loss=0.3082, pruned_loss=0.09502, over 4710277.88 frames. ], batch size: 212, lr: 1.94e-02, grad_scale: 32.0 2023-09-28 22:58:43,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:43,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:58:46,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:58:46,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=169320.0, ans=0.0 2023-09-28 22:58:47,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:58:49,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 22:58:50,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:50,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 22:58:50,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 22:58:52,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 22:58:53,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:56,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=169320.0, ans=0.0 2023-09-28 22:58:57,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:58:57,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:01,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:03,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:03,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:59:03,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=169386.66666666666, ans=0.0 2023-09-28 22:59:04,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=169386.66666666666, ans=0.125 2023-09-28 22:59:06,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:59:06,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:59:07,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:59:12,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=169386.66666666666, ans=0.125 2023-09-28 22:59:13,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:17,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:19,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 22:59:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 22:59:22,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:59:23,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 22:59:23,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:59:23,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:26,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:28,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:28,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=169453.33333333334, ans=0.2 2023-09-28 22:59:30,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 22:59:30,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=169520.0, ans=0.0 2023-09-28 22:59:33,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:59:34,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:59:34,322 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:59:35,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 22:59:36,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:37,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 22:59:40,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:59:41,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:44,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:44,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 22:59:44,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:45,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:59:47,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:59:51,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 22:59:51,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:51,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:59:51,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:59:53,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 22:59:54,526 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.448e+02 2.857e+02 3.478e+02 5.752e+02, threshold=5.715e+02, percent-clipped=2.0 2023-09-28 22:59:54,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:54,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:59:54,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:56,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:57,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 22:59:57,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:00:01,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:00:04,579 INFO [train.py:1039] (1/4) Epoch 5, batch 4200, loss[loss=0.2797, simple_loss=0.3362, pruned_loss=0.1116, over 23310.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3075, pruned_loss=0.09391, over 4714071.85 frames. ], batch size: 105, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:00:04,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 23:00:05,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=169653.33333333334, ans=0.0 2023-09-28 23:00:06,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:00:09,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:11,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:00:11,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:11,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:14,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 23:00:17,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 23:00:17,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:17,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=169653.33333333334, ans=0.125 2023-09-28 23:00:21,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:23,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:00:26,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:00:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:00:28,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:28,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 23:00:28,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:28,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:29,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:29,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:00:30,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=169720.0, ans=0.1 2023-09-28 23:00:32,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:00:34,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 23:00:34,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:39,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:00:41,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:00:44,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:00:46,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:00:46,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=169786.66666666666, ans=0.125 2023-09-28 23:00:47,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:00:47,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 23:00:47,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:00:48,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=169786.66666666666, ans=0.05 2023-09-28 23:00:50,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:00:52,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=169853.33333333334, ans=0.125 2023-09-28 23:00:57,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:00:59,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:03,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:01:07,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 23:01:11,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:01:15,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:01:16,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:19,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 23:01:26,900 INFO [train.py:1039] (1/4) Epoch 5, batch 4250, loss[loss=0.2399, simple_loss=0.2931, pruned_loss=0.09332, over 23454.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.3066, pruned_loss=0.09309, over 4711561.07 frames. ], batch size: 134, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:01:26,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:01:28,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:28,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:01:32,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:38,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:01:38,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 23:01:38,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:01:43,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:45,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:01:50,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:50,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:53,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:01:53,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:01:54,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:57,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:58,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:02:01,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:02:01,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:03,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 23:02:06,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 23:02:06,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:08,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:08,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:02:09,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=170120.0, ans=0.1 2023-09-28 23:02:10,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:02:10,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:11,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:15,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:02:16,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:02:17,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.68 vs. limit=15.0 2023-09-28 23:02:20,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:23,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 23:02:23,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:02:25,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 23:02:26,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:02:26,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:02:28,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=170186.66666666666, ans=0.0 2023-09-28 23:02:29,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:29,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:02:33,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 23:02:33,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=170253.33333333334, ans=0.0 2023-09-28 23:02:35,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:02:35,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:02:35,873 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.53 vs. limit=22.5 2023-09-28 23:02:39,741 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.222e+02 2.521e+02 2.962e+02 6.093e+02, threshold=5.043e+02, percent-clipped=1.0 2023-09-28 23:02:40,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:42,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:43,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:02:45,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:46,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:48,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:02:49,892 INFO [train.py:1039] (1/4) Epoch 5, batch 4300, loss[loss=0.2438, simple_loss=0.3163, pruned_loss=0.08566, over 24445.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3053, pruned_loss=0.09294, over 4701316.59 frames. ], batch size: 69, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:02:49,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:02:49,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 23:02:51,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:52,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.15 vs. limit=15.0 2023-09-28 23:02:58,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:58,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:01,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:03:08,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:03:08,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 23:03:09,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:03:11,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:03:11,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:03:11,403 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 23:03:15,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:03:18,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:20,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 23:03:21,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:03:21,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 23:03:24,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:03:26,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:03:28,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:03:28,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:30,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:03:31,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:31,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:03:32,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 23:03:33,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 23:03:35,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:03:37,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:38,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:03:38,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:38,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:38,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 23:03:38,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 23:03:40,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 23:03:41,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:03:41,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 23:03:41,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 23:03:48,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:50,147 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 23:03:52,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:03:52,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:03:52,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:55,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 23:03:55,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=170586.66666666666, ans=0.0 2023-09-28 23:03:57,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:57,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:57,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:03:57,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:03:57,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:04:00,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:04:00,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=170586.66666666666, ans=0.0 2023-09-28 23:04:02,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=170586.66666666666, ans=0.07 2023-09-28 23:04:03,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:03,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:05,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:04:10,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 23:04:10,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:04:12,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=170653.33333333334, ans=0.0 2023-09-28 23:04:12,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=170653.33333333334, ans=0.125 2023-09-28 23:04:13,724 INFO [train.py:1039] (1/4) Epoch 5, batch 4350, loss[loss=0.253, simple_loss=0.305, pruned_loss=0.1006, over 23763.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3056, pruned_loss=0.09298, over 4720533.60 frames. ], batch size: 195, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:04:15,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:15,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=170653.33333333334, ans=0.125 2023-09-28 23:04:17,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:19,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=170653.33333333334, ans=0.1 2023-09-28 23:04:22,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:04:22,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:04:27,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:04:30,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:34,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:04:34,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:04:37,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:04:39,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:04:40,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:04:40,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=170720.0, ans=0.0 2023-09-28 23:04:45,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 23:04:48,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:50,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:55,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:58,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 23:05:00,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:01,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:05:07,313 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 23:05:09,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=170853.33333333334, ans=0.0 2023-09-28 23:05:10,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:10,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:05:12,257 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 23:05:12,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 23:05:12,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:12,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:13,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:05:15,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:15,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:16,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:05:18,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 23:05:18,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:18,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:18,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:20,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 23:05:20,717 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 23:05:22,109 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 23:05:22,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 23:05:25,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:05:25,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:05:25,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:25,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:05:27,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 2.177e+02 2.511e+02 2.905e+02 5.033e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-28 23:05:28,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 23:05:31,961 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 23:05:31,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:35,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:35,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:36,623 INFO [train.py:1039] (1/4) Epoch 5, batch 4400, loss[loss=0.2173, simple_loss=0.2892, pruned_loss=0.07269, over 24499.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3064, pruned_loss=0.09327, over 4727655.93 frames. ], batch size: 63, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:05:36,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:40,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 23:05:40,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 23:05:41,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 23:05:42,018 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 23:05:42,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=170986.66666666666, ans=0.125 2023-09-28 23:05:42,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=170986.66666666666, ans=0.125 2023-09-28 23:05:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:05:43,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:45,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 23:05:48,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:50,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:50,126 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 23:05:54,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:54,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 23:05:56,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 23:05:59,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 23:06:02,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 23:06:02,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 23:06:02,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:03,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:03,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:05,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 23:06:06,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 23:06:07,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:08,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:06:08,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:09,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=171120.0, ans=0.2 2023-09-28 23:06:10,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:11,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:11,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 23:06:11,960 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 23:06:16,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:25,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:26,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 23:06:29,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:06:33,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:06:35,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:06:35,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 23:06:37,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:06:37,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:06:37,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:06:37,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:06:42,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 23:06:46,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 23:06:48,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 23:06:48,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:48,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 23:06:49,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:06:55,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:06:57,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 23:06:57,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=171253.33333333334, ans=0.0 2023-09-28 23:07:00,187 INFO [train.py:1039] (1/4) Epoch 5, batch 4450, loss[loss=0.2438, simple_loss=0.3191, pruned_loss=0.08421, over 24546.00 frames. ], tot_loss[loss=0.247, simple_loss=0.3069, pruned_loss=0.09354, over 4720509.23 frames. ], batch size: 71, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:07:01,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:07:03,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:05,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:07:07,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-09-28 23:07:11,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:11,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:07:15,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:18,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:07:23,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:07:23,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:23,617 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:07:24,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 23:07:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:24,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:24,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:07:24,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:07:28,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:07:28,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=171386.66666666666, ans=0.05 2023-09-28 23:07:33,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:36,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:37,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:07:39,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.50 vs. limit=15.0 2023-09-28 23:07:40,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:07:42,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 23:07:42,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 23:07:42,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:07:45,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:46,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 23:07:49,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=171520.0, ans=0.1 2023-09-28 23:07:51,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:07:54,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:54,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 23:07:54,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:54,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:07:54,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:07:55,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.49 vs. limit=10.0 2023-09-28 23:07:55,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:58,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:08:03,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:08:03,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 23:08:05,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:08:05,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=171586.66666666666, ans=0.1 2023-09-28 23:08:08,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:10,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:08:11,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:11,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:08:13,314 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.374e+02 2.783e+02 3.317e+02 5.756e+02, threshold=5.567e+02, percent-clipped=2.0 2023-09-28 23:08:15,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:08:18,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 23:08:19,817 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.95 vs. limit=6.0 2023-09-28 23:08:20,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:08:20,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=171586.66666666666, ans=0.125 2023-09-28 23:08:23,522 INFO [train.py:1039] (1/4) Epoch 5, batch 4500, loss[loss=0.2428, simple_loss=0.3146, pruned_loss=0.0855, over 24639.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.3077, pruned_loss=0.09398, over 4717335.41 frames. ], batch size: 73, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:08:25,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:26,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 23:08:26,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 23:08:28,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:33,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:33,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:33,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:08:34,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:08:34,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:34,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:47,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:48,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:08:52,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:52,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:08:55,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:09:01,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:09:06,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:09:06,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=171786.66666666666, ans=0.05 2023-09-28 23:09:09,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=171786.66666666666, ans=0.0 2023-09-28 23:09:11,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:09:13,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=171853.33333333334, ans=0.125 2023-09-28 23:09:14,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:09:14,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 23:09:14,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:16,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:09:20,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=171853.33333333334, ans=0.0 2023-09-28 23:09:21,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:09:21,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 23:09:21,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:09:21,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:28,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:09:28,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:09:31,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:32,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.81 vs. limit=22.5 2023-09-28 23:09:33,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:09:34,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:09:35,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 23:09:37,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 23:09:37,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 23:09:41,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 23:09:44,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 23:09:46,068 INFO [train.py:1039] (1/4) Epoch 5, batch 4550, loss[loss=0.2539, simple_loss=0.29, pruned_loss=0.1089, over 22789.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3061, pruned_loss=0.09324, over 4731154.65 frames. ], batch size: 322, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:09:46,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:09:49,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:51,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:54,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:09:54,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=171986.66666666666, ans=0.0 2023-09-28 23:09:59,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:10:02,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:10:02,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:02,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:10:02,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:06,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:07,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:10:08,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.89 vs. limit=22.5 2023-09-28 23:10:11,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:12,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 23:10:14,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 23:10:14,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:10:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 23:10:20,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 23:10:21,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:24,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 23:10:27,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:10:30,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:10:33,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 23:10:37,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:40,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:40,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:42,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:44,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 23:10:44,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 23:10:44,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:10:45,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 23:10:48,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.whiten.whitening_limit, batch_count=172186.66666666666, ans=12.0 2023-09-28 23:10:48,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 23:10:48,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:49,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:49,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:51,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:51,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:10:52,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:10:54,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 23:10:54,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=172253.33333333334, ans=0.0 2023-09-28 23:10:55,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:55,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:10:56,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 23:10:57,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:10:57,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 23:10:58,976 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.022e+02 2.307e+02 2.730e+02 4.696e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-28 23:11:00,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:11:00,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:11:04,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:11:04,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:11:05,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:11:06,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:11:07,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:11:08,881 INFO [train.py:1039] (1/4) Epoch 5, batch 4600, loss[loss=0.267, simple_loss=0.323, pruned_loss=0.1055, over 23349.00 frames. ], tot_loss[loss=0.2455, simple_loss=0.3056, pruned_loss=0.09266, over 4727078.35 frames. ], batch size: 93, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:11:11,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:12,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:11:16,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:11:16,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:11:16,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:18,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=172320.0, ans=0.125 2023-09-28 23:11:19,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 23:11:21,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:11:23,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=172320.0, ans=0.0 2023-09-28 23:11:25,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:11:25,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:27,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:34,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 23:11:36,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:39,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:42,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:11:45,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:52,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 23:11:52,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:11:53,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:11:57,975 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.63 vs. limit=15.0 2023-09-28 23:11:58,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:58,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:12:02,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:12:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 23:12:05,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:12:10,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:11,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:12:12,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=172520.0, ans=0.0 2023-09-28 23:12:13,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:13,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 23:12:14,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:14,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=172520.0, ans=0.07 2023-09-28 23:12:15,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 23:12:15,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:16,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:18,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:18,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:12:18,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=172586.66666666666, ans=0.09899494936611666 2023-09-28 23:12:20,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:21,318 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=15.0 2023-09-28 23:12:22,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 23:12:22,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=172586.66666666666, ans=0.125 2023-09-28 23:12:23,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 23:12:23,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 23:12:23,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:26,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:26,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:27,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:33,418 INFO [train.py:1039] (1/4) Epoch 5, batch 4650, loss[loss=0.2581, simple_loss=0.3032, pruned_loss=0.1065, over 22782.00 frames. ], tot_loss[loss=0.244, simple_loss=0.3043, pruned_loss=0.09187, over 4733992.03 frames. ], batch size: 322, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:12:37,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=172653.33333333334, ans=0.125 2023-09-28 23:12:38,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:12:39,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=172653.33333333334, ans=0.125 2023-09-28 23:12:41,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:41,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:43,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:12:43,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:43,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:43,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=172653.33333333334, ans=0.05 2023-09-28 23:12:45,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:48,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 23:12:51,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=172720.0, ans=0.0 2023-09-28 23:12:53,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:12:55,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 23:12:56,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:58,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 23:12:58,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:12:58,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 23:12:58,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 23:12:58,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:59,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:13:03,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:13:03,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:03,717 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 23:13:06,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:08,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 23:13:09,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:09,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:13:12,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 23:13:13,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:13:13,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=172786.66666666666, ans=0.1 2023-09-28 23:13:18,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:13:21,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:29,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:31,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:32,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:32,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:13:36,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 23:13:36,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 23:13:36,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 23:13:36,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 23:13:39,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:41,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=172920.0, ans=10.0 2023-09-28 23:13:43,560 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:13:45,873 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.231e+02 2.488e+02 2.964e+02 5.544e+02, threshold=4.977e+02, percent-clipped=2.0 2023-09-28 23:13:48,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:13:48,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:13:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 23:13:48,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:49,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:49,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:13:50,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=172920.0, ans=0.125 2023-09-28 23:13:51,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:13:53,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:13:53,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:54,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:56,018 INFO [train.py:1039] (1/4) Epoch 5, batch 4700, loss[loss=0.2287, simple_loss=0.3002, pruned_loss=0.07861, over 24488.00 frames. ], tot_loss[loss=0.2436, simple_loss=0.3041, pruned_loss=0.09154, over 4738906.84 frames. ], batch size: 66, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:13:58,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:59,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:13:59,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:14:01,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:14:01,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:14:02,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 23:14:08,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=172986.66666666666, ans=0.07 2023-09-28 23:14:11,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:12,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:14:13,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:14:13,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:15,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:14:19,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 23:14:19,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 23:14:23,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:23,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:14:24,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:14:26,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:34,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:14:36,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:14:38,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=173120.0, ans=0.2 2023-09-28 23:14:40,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:47,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 23:14:48,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:14:50,775 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.39 vs. limit=15.0 2023-09-28 23:14:51,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:14:52,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=173186.66666666666, ans=0.0 2023-09-28 23:14:54,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 23:14:54,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:14:59,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:15:01,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 23:15:01,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:02,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:05,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:15:06,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:15:06,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=173253.33333333334, ans=0.125 2023-09-28 23:15:08,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 23:15:08,161 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 23:15:11,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:12,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 23:15:14,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:19,183 INFO [train.py:1039] (1/4) Epoch 5, batch 4750, loss[loss=0.2753, simple_loss=0.3189, pruned_loss=0.1158, over 23437.00 frames. ], tot_loss[loss=0.2449, simple_loss=0.3053, pruned_loss=0.0923, over 4739029.76 frames. ], batch size: 285, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:15:19,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 23:15:21,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:15:23,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:15:29,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 23:15:29,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:15:34,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 23:15:37,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:15:37,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:39,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:44,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 23:15:49,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:15:51,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 23:15:52,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=173453.33333333334, ans=0.125 2023-09-28 23:15:53,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:56,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:56,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:56,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:57,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 23:15:57,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 23:15:58,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=173453.33333333334, ans=0.0 2023-09-28 23:16:02,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 23:16:05,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:07,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:09,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:16:09,125 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 23:16:09,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:12,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:16:14,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:16:14,737 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-09-28 23:16:17,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 23:16:17,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 23:16:19,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:16:19,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:16:19,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:20,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:16:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 23:16:21,272 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.39 vs. limit=22.5 2023-09-28 23:16:24,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 23:16:25,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-09-28 23:16:25,202 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.73 vs. limit=15.0 2023-09-28 23:16:25,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:27,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:16:27,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 23:16:29,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:16:31,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:32,601 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.125e+02 2.370e+02 2.784e+02 4.798e+02, threshold=4.741e+02, percent-clipped=0.0 2023-09-28 23:16:32,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:16:34,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:34,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:16:39,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:39,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 23:16:41,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 23:16:42,588 INFO [train.py:1039] (1/4) Epoch 5, batch 4800, loss[loss=0.1992, simple_loss=0.2671, pruned_loss=0.06559, over 17482.00 frames. ], tot_loss[loss=0.2452, simple_loss=0.3056, pruned_loss=0.09247, over 4730447.38 frames. ], batch size: 38, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:16:42,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 23:16:45,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:16:45,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:48,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 23:16:53,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:55,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:56,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.08 vs. limit=15.0 2023-09-28 23:16:57,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=173720.0, ans=0.0 2023-09-28 23:16:59,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:17:02,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:02,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 23:17:03,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:17:03,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:17:05,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:17:05,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=173720.0, ans=0.2 2023-09-28 23:17:12,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:12,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:12,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:17:16,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:16,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:17:16,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:17,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.98 vs. limit=6.0 2023-09-28 23:17:17,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:20,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:23,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:17:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:17:28,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:32,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 23:17:32,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 23:17:32,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:32,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:17:33,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:17:33,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:33,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:17:34,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:17:35,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:37,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:38,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=173853.33333333334, ans=0.125 2023-09-28 23:17:41,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:43,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:17:47,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=173853.33333333334, ans=0.1 2023-09-28 23:17:48,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 23:17:48,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:48,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:49,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:17:49,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:53,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:54,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:17:54,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:54,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:17:54,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:17:56,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:18:00,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:00,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:00,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:18:01,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 23:18:04,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 23:18:04,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:04,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:06,215 INFO [train.py:1039] (1/4) Epoch 5, batch 4850, loss[loss=0.2546, simple_loss=0.3044, pruned_loss=0.1024, over 23254.00 frames. ], tot_loss[loss=0.2462, simple_loss=0.3063, pruned_loss=0.09309, over 4724377.83 frames. ], batch size: 119, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:18:06,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:06,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:09,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:18:10,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.05 vs. limit=15.0 2023-09-28 23:18:15,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=173986.66666666666, ans=0.0 2023-09-28 23:18:19,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 23:18:21,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:27,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:29,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:18:29,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:31,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=174053.33333333334, ans=0.2 2023-09-28 23:18:32,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:32,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:18:34,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:18:34,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 23:18:39,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:42,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:18:42,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:18:44,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:18:44,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 23:18:46,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:46,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 23:18:50,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 23:18:53,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:18:58,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=174186.66666666666, ans=0.1 2023-09-28 23:19:01,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:19:02,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 23:19:02,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:19:02,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:19:04,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:19:06,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 23:19:06,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:07,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 23:19:07,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=174186.66666666666, ans=0.0 2023-09-28 23:19:08,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:10,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:12,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 23:19:18,958 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.747e+02 2.375e+02 2.676e+02 3.229e+02 5.316e+02, threshold=5.352e+02, percent-clipped=3.0 2023-09-28 23:19:22,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:28,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:19:28,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:30,256 INFO [train.py:1039] (1/4) Epoch 5, batch 4900, loss[loss=0.2399, simple_loss=0.2852, pruned_loss=0.09735, over 23646.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.3048, pruned_loss=0.09239, over 4729829.77 frames. ], batch size: 232, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:19:33,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 23:19:33,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:19:36,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=174320.0, ans=0.05 2023-09-28 23:19:36,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=174320.0, ans=0.125 2023-09-28 23:19:39,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:40,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:42,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:19:43,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 23:19:44,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=174386.66666666666, ans=0.0 2023-09-28 23:19:44,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=174386.66666666666, ans=0.125 2023-09-28 23:19:46,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=174386.66666666666, ans=0.125 2023-09-28 23:19:49,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 23:19:49,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=174386.66666666666, ans=0.1 2023-09-28 23:19:54,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 23:19:54,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=174386.66666666666, ans=0.2 2023-09-28 23:19:55,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 23:19:55,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:19:55,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:55,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:19:55,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:55,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:19:57,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 23:19:58,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.01 vs. limit=15.0 2023-09-28 23:20:01,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 23:20:01,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:20:03,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:20:05,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:20:08,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:20:08,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:09,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:09,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 23:20:10,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.95 vs. limit=15.0 2023-09-28 23:20:11,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:20:11,624 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:20:12,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:20:12,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 23:20:12,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 23:20:17,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 23:20:19,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:20:21,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:20:22,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:20:22,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:22,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:20:22,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:20:22,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 23:20:24,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:24,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=174520.0, ans=0.5 2023-09-28 23:20:27,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:20:29,732 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.90 vs. limit=15.0 2023-09-28 23:20:30,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:20:34,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 23:20:36,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:20:36,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.76 vs. limit=15.0 2023-09-28 23:20:37,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:20:37,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 23:20:44,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:44,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:20:44,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 23:20:44,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:20:46,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:20:46,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:52,893 INFO [train.py:1039] (1/4) Epoch 5, batch 4950, loss[loss=0.2184, simple_loss=0.2512, pruned_loss=0.09276, over 19205.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3039, pruned_loss=0.09271, over 4720594.02 frames. ], batch size: 389, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:20:53,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:20:53,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:20:54,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:54,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 23:20:56,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:20:59,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:20:59,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:21:01,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 23:21:01,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 23:21:01,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:21:02,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 23:21:02,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:02,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:21:03,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:21:03,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:06,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:06,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:21:10,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:21:10,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.55 vs. limit=15.0 2023-09-28 23:21:12,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:21:12,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:13,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:21:16,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:21:19,473 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.18 vs. limit=15.0 2023-09-28 23:21:20,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=174720.0, ans=0.0 2023-09-28 23:21:21,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:23,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=174720.0, ans=10.0 2023-09-28 23:21:25,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:21:25,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=174786.66666666666, ans=0.0 2023-09-28 23:21:26,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:28,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:28,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:21:31,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 23:21:31,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 23:21:33,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:34,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:21:34,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:21:36,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=174786.66666666666, ans=0.0 2023-09-28 23:21:38,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:21:38,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:21:38,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:21:41,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:43,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:21:45,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:21:46,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:48,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:48,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 23:21:49,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:21:51,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:21:55,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:21:56,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:21:56,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:21:56,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:57,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:21:57,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:21:59,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:21:59,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:22:01,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:22:03,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 23:22:05,864 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.824e+02 2.215e+02 2.550e+02 3.115e+02 4.856e+02, threshold=5.099e+02, percent-clipped=0.0 2023-09-28 23:22:07,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:12,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 23:22:12,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:22:16,396 INFO [train.py:1039] (1/4) Epoch 5, batch 5000, loss[loss=0.2376, simple_loss=0.3152, pruned_loss=0.07998, over 24457.00 frames. ], tot_loss[loss=0.2437, simple_loss=0.3029, pruned_loss=0.09227, over 4712854.57 frames. ], batch size: 66, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:22:21,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:22:21,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:22,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 23:22:22,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=174986.66666666666, ans=0.035 2023-09-28 23:22:22,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.77 vs. limit=15.0 2023-09-28 23:22:23,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 23:22:26,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:22:28,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 23:22:28,947 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.23 vs. limit=15.0 2023-09-28 23:22:29,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:22:29,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:22:29,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 23:22:31,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:31,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:22:33,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 23:22:33,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:33,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:22:34,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 23:22:35,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 23:22:36,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:22:36,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 23:22:36,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:22:38,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:38,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:22:38,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 23:22:38,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 23:22:39,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 23:22:39,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:41,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:42,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 23:22:42,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:45,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:47,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:48,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:22:50,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 23:22:50,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:22:52,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:22:57,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 23:23:00,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:23:01,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:23:01,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:04,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 23:23:04,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:23:05,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:05,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:06,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 23:23:08,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:11,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:13,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:19,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 23:23:25,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:34,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:36,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:36,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:23:36,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:37,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:23:37,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:23:37,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:39,237 INFO [train.py:1039] (1/4) Epoch 5, batch 5050, loss[loss=0.2329, simple_loss=0.3087, pruned_loss=0.07855, over 24633.00 frames. ], tot_loss[loss=0.2442, simple_loss=0.3036, pruned_loss=0.09247, over 4712293.20 frames. ], batch size: 73, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:23:42,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:44,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 23:23:44,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:23:47,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:49,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:23:49,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 23:23:50,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:50,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:53,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:23:55,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:23:55,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:24:04,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 23:24:06,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:24:08,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:08,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 23:24:09,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:11,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:11,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:24:11,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:24:11,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 23:24:12,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 23:24:14,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:17,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:21,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:21,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 23:24:22,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:25,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 23:24:27,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:24:29,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:24:29,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:31,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:32,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:24:34,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:24:35,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:35,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:24:36,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:24:36,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 23:24:37,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:24:39,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:42,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:42,661 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 23:24:42,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:24:46,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:24:47,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:47,474 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 23:24:50,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:50,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 23:24:50,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:51,912 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.101e+02 2.649e+02 3.047e+02 4.508e+02, threshold=5.297e+02, percent-clipped=0.0 2023-09-28 23:24:52,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=175586.66666666666, ans=0.0 2023-09-28 23:24:55,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:57,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:57,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 23:24:57,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 23:25:00,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:00,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:01,969 INFO [train.py:1039] (1/4) Epoch 5, batch 5100, loss[loss=0.2416, simple_loss=0.298, pruned_loss=0.09258, over 17101.00 frames. ], tot_loss[loss=0.2442, simple_loss=0.3037, pruned_loss=0.09241, over 4702035.96 frames. ], batch size: 37, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:25:02,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:25:05,040 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 23:25:07,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:25:09,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 23:25:11,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 23:25:11,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:12,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:25:14,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=175653.33333333334, ans=0.125 2023-09-28 23:25:15,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:25:17,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 23:25:17,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 23:25:20,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:25:22,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:25:24,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.50 vs. limit=15.0 2023-09-28 23:25:25,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 23:25:30,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:32,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:25:32,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:25:35,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 23:25:39,769 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 23:25:40,330 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.20 vs. limit=15.0 2023-09-28 23:25:42,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:42,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 23:25:42,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 23:25:47,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:48,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.75 vs. limit=10.0 2023-09-28 23:25:56,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:25:59,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 23:26:00,057 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 23:26:01,427 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 23:26:03,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 23:26:03,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:26:05,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 23:26:10,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 23:26:11,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:26:13,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:26:15,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 23:26:18,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:26:18,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 23:26:24,741 INFO [train.py:1039] (1/4) Epoch 5, batch 5150, loss[loss=0.2605, simple_loss=0.3281, pruned_loss=0.09642, over 24655.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3049, pruned_loss=0.09316, over 4699111.61 frames. ], batch size: 73, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:26:24,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:26:24,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:26:24,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:26:25,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:26:25,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:26:26,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:26:26,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 23:26:26,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 23:26:28,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 23:26:28,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:26:28,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 23:26:30,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:31,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:26:33,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:35,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:39,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=175986.66666666666, ans=0.07 2023-09-28 23:26:39,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=175986.66666666666, ans=0.1 2023-09-28 23:26:40,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:26:40,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 23:26:41,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:41,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:26:44,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:26:44,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:26:44,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:26:46,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:26:46,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:26:46,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 23:26:47,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:26:48,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:26:50,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:26:52,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 23:26:53,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:26:55,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=176053.33333333334, ans=0.1 2023-09-28 23:26:59,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:27:01,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 23:27:05,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:27:05,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=176120.0, ans=0.125 2023-09-28 23:27:11,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:13,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:16,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:16,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:20,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 23:27:21,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=176186.66666666666, ans=0.125 2023-09-28 23:27:24,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:27:26,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:27:26,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:27:30,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:31,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:32,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 23:27:32,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=176253.33333333334, ans=0.1 2023-09-28 23:27:37,275 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.252e+02 2.481e+02 2.857e+02 3.938e+02, threshold=4.962e+02, percent-clipped=0.0 2023-09-28 23:27:37,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:37,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=176253.33333333334, ans=0.1 2023-09-28 23:27:40,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:27:42,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:42,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:27:44,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:27:44,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:27:44,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:27:44,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:27:47,116 INFO [train.py:1039] (1/4) Epoch 5, batch 5200, loss[loss=0.2089, simple_loss=0.2776, pruned_loss=0.07007, over 24565.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3056, pruned_loss=0.09373, over 4680510.16 frames. ], batch size: 60, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:27:49,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:27:51,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:27:55,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:27:58,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.63 vs. limit=12.0 2023-09-28 23:28:01,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 23:28:01,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:28:02,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:03,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:04,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:28:05,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:07,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 23:28:09,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:28:09,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:12,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 23:28:16,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:28:16,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.94 vs. limit=15.0 2023-09-28 23:28:17,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:28:17,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 23:28:17,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 23:28:22,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 23:28:22,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:22,295 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 23:28:22,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:25,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:28:27,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 23:28:27,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:28:29,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=176453.33333333334, ans=0.125 2023-09-28 23:28:30,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:34,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 23:28:34,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 23:28:34,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 23:28:38,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=176520.0, ans=0.125 2023-09-28 23:28:40,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=12.0 2023-09-28 23:28:40,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 23:28:40,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:28:47,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:28:47,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:28:48,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 23:28:48,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:48,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:28:48,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:50,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:28:54,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:28:55,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:28:58,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:29:00,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:00,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:06,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.21 vs. limit=22.5 2023-09-28 23:29:07,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:07,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 23:29:08,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:29:08,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:29:09,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=176653.33333333334, ans=0.0 2023-09-28 23:29:10,857 INFO [train.py:1039] (1/4) Epoch 5, batch 5250, loss[loss=0.2175, simple_loss=0.2841, pruned_loss=0.07551, over 18483.00 frames. ], tot_loss[loss=0.246, simple_loss=0.3049, pruned_loss=0.09357, over 4672891.20 frames. ], batch size: 40, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:29:10,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:11,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:29:11,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:29:14,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:29:17,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:17,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:29:18,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:29:22,293 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:29:23,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:27,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:29:28,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:29:29,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:29:31,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 23:29:31,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:34,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:37,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=176720.0, ans=0.0 2023-09-28 23:29:51,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.50 vs. limit=15.0 2023-09-28 23:30:12,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.83 vs. limit=22.5 2023-09-28 23:30:16,424 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.340e+02 2.624e+02 3.163e+02 5.259e+02, threshold=5.248e+02, percent-clipped=2.0 2023-09-28 23:30:16,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=176920.0, ans=0.0 2023-09-28 23:30:24,702 INFO [train.py:1039] (1/4) Epoch 5, batch 5300, loss[loss=0.2679, simple_loss=0.3291, pruned_loss=0.1034, over 24575.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3042, pruned_loss=0.09355, over 4678668.65 frames. ], batch size: 71, lr: 1.90e-02, grad_scale: 32.0 2023-09-28 23:30:24,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=176986.66666666666, ans=0.125 2023-09-28 23:30:40,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:30:40,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 23:30:40,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 23:30:40,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:41,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:41,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:41,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:41,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:41,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:30:41,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:41,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:30:41,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:30:41,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 23:30:42,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 23:30:42,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 23:30:42,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:30:42,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 23:30:42,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 23:30:42,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:43,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:43,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:43,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:44,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:30:44,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:44,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:44,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:44,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:44,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:44,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:30:44,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:44,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:30:45,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 23:30:45,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:46,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:46,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 23:30:46,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 23:30:46,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:30:46,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:30:46,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 23:30:46,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 23:30:46,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:47,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:30:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:48,362 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 23:30:48,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 23:30:48,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:30:48,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:48,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 23:30:48,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 23:30:48,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 23:30:49,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:57,113 INFO [train.py:1039] (1/4) Epoch 6, batch 0, loss[loss=0.2545, simple_loss=0.3094, pruned_loss=0.09985, over 23803.00 frames. ], tot_loss[loss=0.2545, simple_loss=0.3094, pruned_loss=0.09985, over 23803.00 frames. ], batch size: 164, lr: 1.78e-02, grad_scale: 32.0 2023-09-28 23:30:57,113 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-28 23:31:12,854 INFO [train.py:1071] (1/4) Epoch 6, validation: loss=0.2892, simple_loss=0.2993, pruned_loss=0.1395, over 1125622.00 frames. 2023-09-28 23:31:12,855 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-28 23:31:16,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 23:31:16,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:31:18,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:31:24,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:24,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:31:24,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:24,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 23:31:26,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 23:31:27,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:29,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:34,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:31:34,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:35,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 23:31:38,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:44,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:31:44,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:49,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 23:31:53,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:31:53,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:31:55,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:31:55,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=177200.0, ans=0.04949747468305833 2023-09-28 23:32:00,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:32:04,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:32:04,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=177266.66666666666, ans=0.125 2023-09-28 23:32:10,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 23:32:12,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 23:32:13,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:13,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:15,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:32:17,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:32:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 23:32:17,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=177333.33333333334, ans=0.125 2023-09-28 23:32:19,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:22,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=177333.33333333334, ans=0.2 2023-09-28 23:32:23,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:26,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=177333.33333333334, ans=0.2 2023-09-28 23:32:27,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:32:31,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.52 vs. limit=6.0 2023-09-28 23:32:32,020 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 23:32:33,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:32:34,991 INFO [train.py:1039] (1/4) Epoch 6, batch 50, loss[loss=0.2317, simple_loss=0.3046, pruned_loss=0.07935, over 24641.00 frames. ], tot_loss[loss=0.2513, simple_loss=0.3091, pruned_loss=0.09673, over 1059551.41 frames. ], batch size: 68, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:32:38,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:38,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=177400.0, ans=0.09899494936611666 2023-09-28 23:32:41,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:41,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 23:32:41,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:32:42,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:32:44,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:46,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:49,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:55,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 23:32:55,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:03,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:33:04,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 23:33:06,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 23:33:08,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:33:08,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:09,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:11,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:11,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:33:13,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:33:13,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:21,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:22,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:33:22,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:23,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:33:25,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 23:33:26,831 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.186e+02 2.592e+02 3.142e+02 7.850e+02, threshold=5.184e+02, percent-clipped=2.0 2023-09-28 23:33:27,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:33:28,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:33:28,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 23:33:28,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:28,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:29,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:30,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 23:33:39,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-28 23:33:39,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:33:40,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:42,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:43,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:43,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:45,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=177666.66666666666, ans=0.125 2023-09-28 23:33:47,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 23:33:47,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 23:33:47,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:48,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:50,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:50,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:50,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=177666.66666666666, ans=0.0 2023-09-28 23:33:51,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 23:33:52,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 23:33:54,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:33:54,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:54,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:33:56,335 INFO [train.py:1039] (1/4) Epoch 6, batch 100, loss[loss=0.218, simple_loss=0.2896, pruned_loss=0.07324, over 24473.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3069, pruned_loss=0.09214, over 1888758.45 frames. ], batch size: 63, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:33:56,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 23:33:56,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 23:33:58,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:58,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:01,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:34:01,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:34:02,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:34:05,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:34:08,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:11,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 23:34:12,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:34:17,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:34:17,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:17,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:17,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:34:19,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:19,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 23:34:21,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:34:21,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:21,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:21,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:25,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 23:34:25,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:27,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:28,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.99 vs. limit=15.0 2023-09-28 23:34:28,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:34:30,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:34:34,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 23:34:34,185 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 23:34:37,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:34:37,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:34:40,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:34:42,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:43,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:51,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:52,814 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 23:34:54,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:34:58,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:34:59,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:01,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:02,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=178000.0, ans=0.0 2023-09-28 23:35:04,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=178000.0, ans=10.0 2023-09-28 23:35:05,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:07,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:08,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:35:10,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:11,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:14,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:15,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:35:15,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:16,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 23:35:18,638 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 23:35:18,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:18,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:35:20,656 INFO [train.py:1039] (1/4) Epoch 6, batch 150, loss[loss=0.2325, simple_loss=0.306, pruned_loss=0.07946, over 24042.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3069, pruned_loss=0.09403, over 2504079.78 frames. ], batch size: 80, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:35:20,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:20,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:20,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:35:22,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:35:22,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:35:22,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:22,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:22,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=178066.66666666666, ans=0.0 2023-09-28 23:35:24,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:24,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:35:24,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:35:27,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:31,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:31,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:35:31,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:33,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=178066.66666666666, ans=0.125 2023-09-28 23:35:34,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:35,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:36,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.12 vs. limit=22.5 2023-09-28 23:35:37,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:35:39,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:39,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.05 vs. limit=22.5 2023-09-28 23:35:41,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.40 vs. limit=22.5 2023-09-28 23:35:42,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 23:35:42,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 23:35:42,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 23:35:47,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:35:47,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:35:47,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:48,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:48,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:50,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:50,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:52,068 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.59 vs. limit=15.0 2023-09-28 23:35:54,358 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 23:35:56,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:01,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:04,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:36:06,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 23:36:09,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:36:09,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:10,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:12,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:36:13,832 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.160e+02 2.435e+02 3.119e+02 4.742e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-28 23:36:15,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:36:15,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:36:16,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:17,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 23:36:20,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=178266.66666666666, ans=0.125 2023-09-28 23:36:23,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:25,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:25,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:36:25,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:36:28,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:31,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 23:36:33,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:36:34,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:36:36,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:38,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:36:38,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 23:36:38,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:38,538 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 23:36:42,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:43,678 INFO [train.py:1039] (1/4) Epoch 6, batch 200, loss[loss=0.2207, simple_loss=0.2961, pruned_loss=0.07264, over 24424.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3069, pruned_loss=0.09417, over 2990577.53 frames. ], batch size: 69, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:36:45,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=178400.0, ans=0.125 2023-09-28 23:36:46,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:36:46,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:36:47,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=178400.0, ans=0.125 2023-09-28 23:36:48,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 23:36:48,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=178400.0, ans=0.2 2023-09-28 23:36:50,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:50,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:51,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 23:36:53,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:36:54,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:56,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:01,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:37:01,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:37:01,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:03,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=178466.66666666666, ans=0.0 2023-09-28 23:37:13,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=178466.66666666666, ans=0.2 2023-09-28 23:37:24,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:37:24,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:37:24,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:37:25,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:37:26,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 23:37:26,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:37:28,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=22.5 2023-09-28 23:37:29,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:30,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:37:32,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:32,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:37:32,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 23:37:33,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:37:33,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:37,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.69 vs. limit=15.0 2023-09-28 23:37:39,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:37:44,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:51,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:53,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:37:59,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:01,982 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.66 vs. limit=6.0 2023-09-28 23:38:02,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 23:38:02,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:02,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:38:02,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:04,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:38:05,648 INFO [train.py:1039] (1/4) Epoch 6, batch 250, loss[loss=0.2378, simple_loss=0.2999, pruned_loss=0.08783, over 23950.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.3051, pruned_loss=0.09185, over 3385623.62 frames. ], batch size: 86, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:38:07,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 23:38:07,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:38:08,672 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 23:38:10,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:38:13,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:13,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:17,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:38:17,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:19,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:38:24,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:38:24,916 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.39 vs. limit=10.0 2023-09-28 23:38:30,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=178800.0, ans=0.0 2023-09-28 23:38:35,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:37,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:39,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:38:46,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:38:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:38:47,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:38:47,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:49,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:38:49,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:38:51,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:52,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:38:56,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 23:38:56,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:57,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:38:58,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=178933.33333333334, ans=0.1 2023-09-28 23:38:59,143 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.163e+02 2.470e+02 2.985e+02 4.206e+02, threshold=4.941e+02, percent-clipped=0.0 2023-09-28 23:38:59,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:38:59,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:39:01,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:02,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:39:02,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:39:04,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:04,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=178933.33333333334, ans=0.125 2023-09-28 23:39:06,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:39:06,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:09,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:39:09,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=178933.33333333334, ans=0.0 2023-09-28 23:39:12,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:13,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.20 vs. limit=12.0 2023-09-28 23:39:14,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:39:15,141 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.03 vs. limit=10.0 2023-09-28 23:39:18,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=179000.0, ans=0.0 2023-09-28 23:39:21,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:23,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:39:26,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 23:39:26,709 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:39:28,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:39:28,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:29,806 INFO [train.py:1039] (1/4) Epoch 6, batch 300, loss[loss=0.2661, simple_loss=0.2875, pruned_loss=0.1223, over 19569.00 frames. ], tot_loss[loss=0.2425, simple_loss=0.3029, pruned_loss=0.09099, over 3681884.02 frames. ], batch size: 389, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:39:30,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 23:39:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:39:31,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:39:31,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 23:39:33,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=179066.66666666666, ans=0.0 2023-09-28 23:39:36,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:38,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:39:38,649 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:39:40,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:39:41,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 23:39:41,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:44,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:39:44,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 23:39:44,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:39:49,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:39:54,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:39:54,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 23:39:58,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 23:39:58,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:39:59,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=179133.33333333334, ans=0.0 2023-09-28 23:40:01,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:03,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:03,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 23:40:03,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:40:06,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:40:10,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:40:10,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:14,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:40:14,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 23:40:16,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:40:17,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:19,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 23:40:21,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:24,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:40:28,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:40:28,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 23:40:31,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:31,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:40:35,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:36,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:40:38,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 23:40:38,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:40:39,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:40,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 23:40:40,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=179333.33333333334, ans=0.125 2023-09-28 23:40:43,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:43,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:45,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:46,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:46,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:52,695 INFO [train.py:1039] (1/4) Epoch 6, batch 350, loss[loss=0.257, simple_loss=0.3145, pruned_loss=0.09969, over 23275.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.2996, pruned_loss=0.08976, over 3889593.14 frames. ], batch size: 93, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:40:52,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:40:52,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:40:55,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:56,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=179400.0, ans=0.1 2023-09-28 23:41:03,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:41:07,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:07,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:10,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 23:41:10,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:11,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 23:41:14,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:16,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 23:41:16,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:16,878 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.52 vs. limit=15.0 2023-09-28 23:41:21,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 23:41:21,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:41:21,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=179466.66666666666, ans=0.125 2023-09-28 23:41:24,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:24,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:41:25,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:25,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:27,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:41:27,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:28,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:41:30,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:41:30,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:32,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=179533.33333333334, ans=0.125 2023-09-28 23:41:39,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:41:39,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:41:40,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:41:40,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:45,455 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.204e+02 2.490e+02 2.803e+02 5.345e+02, threshold=4.981e+02, percent-clipped=1.0 2023-09-28 23:41:47,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 23:41:47,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:54,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:54,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:41:54,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:54,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=179600.0, ans=0.125 2023-09-28 23:41:56,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 23:41:57,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:41:59,166 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 23:42:00,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 23:42:00,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:02,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=179666.66666666666, ans=0.0 2023-09-28 23:42:03,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:42:03,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 23:42:05,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:08,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=179666.66666666666, ans=0.1 2023-09-28 23:42:09,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:42:09,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:12,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:12,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:15,715 INFO [train.py:1039] (1/4) Epoch 6, batch 400, loss[loss=0.2684, simple_loss=0.3399, pruned_loss=0.09845, over 24369.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.2997, pruned_loss=0.08924, over 4077767.04 frames. ], batch size: 77, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:42:15,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:17,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:42:21,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:42:21,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 23:42:21,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:23,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:25,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:42:26,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:29,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:32,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 23:42:33,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 23:42:33,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:35,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 23:42:35,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:39,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:42:39,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:40,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 23:42:41,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:42:41,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:41,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:41,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:43,288 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 23:42:43,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 23:42:47,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=179866.66666666666, ans=0.125 2023-09-28 23:42:50,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:50,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:52,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 23:42:53,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 23:42:57,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:42:59,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:05,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 23:43:08,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:43:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 23:43:12,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:43:14,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:43:15,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 23:43:15,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=179933.33333333334, ans=0.0 2023-09-28 23:43:19,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:43:21,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:43:23,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:43:24,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:24,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 23:43:29,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:43:31,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 23:43:32,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:43:32,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:43:36,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 23:43:38,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:43:38,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:43:38,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:43:39,730 INFO [train.py:1039] (1/4) Epoch 6, batch 450, loss[loss=0.2608, simple_loss=0.329, pruned_loss=0.09631, over 24397.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3005, pruned_loss=0.09014, over 4204460.49 frames. ], batch size: 77, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:43:41,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 23:43:41,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:43:41,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:43:42,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:43:42,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 23:43:43,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:43:44,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:43:44,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=180066.66666666666, ans=0.0 2023-09-28 23:43:46,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:43:59,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:59,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:01,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 23:44:02,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 23:44:06,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:44:09,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=180133.33333333334, ans=0.125 2023-09-28 23:44:10,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:12,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:17,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:17,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:19,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 23:44:20,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 23:44:22,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 23:44:22,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:44:23,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:25,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:44:28,860 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 23:44:28,873 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 23:44:28,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:30,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:44:31,953 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.131e+02 2.453e+02 2.864e+02 4.653e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-28 23:44:32,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:44:37,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:44:37,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:44:38,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:44:39,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 23:44:42,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:44,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:44:45,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:44:45,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 23:44:50,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:44:50,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 23:44:52,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 23:44:53,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:57,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:44:59,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:00,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:45:00,821 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 23:45:02,041 INFO [train.py:1039] (1/4) Epoch 6, batch 500, loss[loss=0.2899, simple_loss=0.3268, pruned_loss=0.1266, over 22674.00 frames. ], tot_loss[loss=0.2421, simple_loss=0.3021, pruned_loss=0.09102, over 4310837.88 frames. ], batch size: 322, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:45:05,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:05,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=180400.0, ans=0.2 2023-09-28 23:45:06,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:45:06,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:06,883 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 23:45:07,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=180400.0, ans=0.125 2023-09-28 23:45:09,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 23:45:09,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:12,653 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.12 vs. limit=10.0 2023-09-28 23:45:13,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:45:17,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:45:17,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:45:20,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:20,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:20,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:31,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:32,572 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.63 vs. limit=15.0 2023-09-28 23:45:33,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:45:33,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:45:33,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 23:45:35,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:45:38,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:45:40,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:45:40,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:45:40,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:40,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 23:45:43,432 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 23:45:47,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:45:47,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=180533.33333333334, ans=0.2 2023-09-28 23:45:48,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:48,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:45:52,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 23:45:55,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:45:57,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:01,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:07,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:46:14,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:16,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 23:46:16,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:16,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:18,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 23:46:19,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:46:19,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:24,643 INFO [train.py:1039] (1/4) Epoch 6, batch 550, loss[loss=0.3292, simple_loss=0.3592, pruned_loss=0.1496, over 19152.00 frames. ], tot_loss[loss=0.2436, simple_loss=0.3034, pruned_loss=0.09188, over 4389862.46 frames. ], batch size: 388, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:46:28,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 23:46:28,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 23:46:28,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:28,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=180733.33333333334, ans=0.125 2023-09-28 23:46:29,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 23:46:30,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:46:30,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:30,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:32,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:33,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:46:33,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:46:36,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:38,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 23:46:38,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:46:43,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:46:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:45,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:46:47,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:51,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.31 vs. limit=15.0 2023-09-28 23:46:51,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 23:46:51,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 23:46:53,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:47:00,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:47:00,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:02,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:47:05,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:05,213 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 23:47:07,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:47:08,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:47:09,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=180866.66666666666, ans=0.125 2023-09-28 23:47:13,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:14,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:47:14,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:47:14,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:15,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 23:47:17,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 23:47:18,454 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.230e+02 2.579e+02 3.045e+02 5.000e+02, threshold=5.158e+02, percent-clipped=1.0 2023-09-28 23:47:18,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:18,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:47:20,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:47:20,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:47:23,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:47:24,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:47:26,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:47:27,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:29,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:47:29,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:47:33,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:34,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:47:34,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:36,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:47:36,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:47:44,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 23:47:47,847 INFO [train.py:1039] (1/4) Epoch 6, batch 600, loss[loss=0.2795, simple_loss=0.321, pruned_loss=0.119, over 23776.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.3043, pruned_loss=0.09231, over 4461873.35 frames. ], batch size: 164, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:47:49,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 23:47:51,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:47:51,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:47:51,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:57,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:47:59,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:48:01,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 23:48:03,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:48:06,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:07,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:09,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 23:48:09,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:48:17,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 23:48:20,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=181200.0, ans=0.125 2023-09-28 23:48:21,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:48:21,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:21,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:48:23,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=181200.0, ans=0.125 2023-09-28 23:48:29,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:48:29,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:48:29,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:36,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=181266.66666666666, ans=0.0 2023-09-28 23:48:37,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=181266.66666666666, ans=0.125 2023-09-28 23:48:38,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:48:42,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:42,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:42,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:48,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181266.66666666666, ans=0.1 2023-09-28 23:48:51,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 23:48:56,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:48:56,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:00,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=181333.33333333334, ans=0.1 2023-09-28 23:49:02,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=181333.33333333334, ans=0.125 2023-09-28 23:49:03,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 23:49:03,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:49:04,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.39 vs. limit=15.0 2023-09-28 23:49:06,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 23:49:06,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:49:06,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:49:11,015 INFO [train.py:1039] (1/4) Epoch 6, batch 650, loss[loss=0.2501, simple_loss=0.3003, pruned_loss=0.09997, over 23792.00 frames. ], tot_loss[loss=0.2421, simple_loss=0.3024, pruned_loss=0.09093, over 4525767.86 frames. ], batch size: 164, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:49:13,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:49:13,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=181400.0, ans=0.125 2023-09-28 23:49:14,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:49:16,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:49:16,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:49:19,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:23,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 23:49:23,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:49:28,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=181466.66666666666, ans=0.125 2023-09-28 23:49:30,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:49:30,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:35,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:38,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 23:49:39,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:49:40,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:43,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:45,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 23:49:47,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:48,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:48,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:49:50,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:50,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:49:52,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:49:54,017 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 23:49:54,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:54,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:49:57,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:58,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:49:58,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:49:58,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:50:00,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 23:50:01,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:50:01,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:50:05,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.251e+02 2.578e+02 2.975e+02 4.088e+02, threshold=5.156e+02, percent-clipped=0.0 2023-09-28 23:50:05,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:50:05,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:50:05,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:50:08,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 23:50:08,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 23:50:09,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:09,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:50:09,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:50:09,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:50:10,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:50:18,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:18,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:50:21,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:50:24,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:24,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 23:50:25,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:33,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:50:33,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:33,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:34,418 INFO [train.py:1039] (1/4) Epoch 6, batch 700, loss[loss=0.2264, simple_loss=0.2991, pruned_loss=0.07688, over 24489.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3014, pruned_loss=0.08976, over 4577267.16 frames. ], batch size: 63, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:50:34,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:37,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 23:50:37,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 23:50:42,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 23:50:42,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.89 vs. limit=15.0 2023-09-28 23:50:43,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:45,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:50:45,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=181733.33333333334, ans=0.04949747468305833 2023-09-28 23:50:48,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 23:50:52,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:55,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:50:56,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:58,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:50:58,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:51:02,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:51:05,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 23:51:05,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:51:08,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 23:51:12,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 23:51:15,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:51:15,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:51:17,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:51:22,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:51:22,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 23:51:29,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:29,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:51:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 23:51:29,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=181933.33333333334, ans=0.125 2023-09-28 23:51:34,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:51:34,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:37,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:51:44,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:51:44,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 23:51:47,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 23:51:47,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 23:51:51,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:52,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:51:54,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:51:56,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:56,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 23:51:57,427 INFO [train.py:1039] (1/4) Epoch 6, batch 750, loss[loss=0.2542, simple_loss=0.3034, pruned_loss=0.1025, over 23751.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3012, pruned_loss=0.0899, over 4604185.12 frames. ], batch size: 232, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:51:58,456 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.23 vs. limit=6.0 2023-09-28 23:52:02,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 23:52:02,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 23:52:03,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 23:52:03,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 23:52:05,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 23:52:05,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:52:07,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 23:52:08,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=182066.66666666666, ans=0.125 2023-09-28 23:52:09,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:52:09,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:09,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=182066.66666666666, ans=0.125 2023-09-28 23:52:12,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:14,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:15,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:52:15,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:17,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:52:17,697 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:52:18,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:52:21,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:52:22,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=182133.33333333334, ans=0.0 2023-09-28 23:52:24,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:25,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 23:52:27,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:52:28,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:30,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:32,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:52:32,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 23:52:32,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:52:35,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 23:52:35,533 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 23:52:37,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 23:52:37,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:52:37,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:52:39,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:52:44,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:44,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:52:44,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:52:47,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:49,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:51,234 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.788e+02 2.168e+02 2.495e+02 2.811e+02 4.815e+02, threshold=4.990e+02, percent-clipped=0.0 2023-09-28 23:52:51,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 23:52:51,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:52:52,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:52:53,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:52:54,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=182266.66666666666, ans=0.125 2023-09-28 23:52:56,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:52:56,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 23:52:57,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:05,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:05,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:53:07,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:10,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:53:14,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 23:53:14,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:14,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:14,794 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:53:16,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:18,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:21,136 INFO [train.py:1039] (1/4) Epoch 6, batch 800, loss[loss=0.2324, simple_loss=0.3088, pruned_loss=0.078, over 24676.00 frames. ], tot_loss[loss=0.2411, simple_loss=0.3014, pruned_loss=0.09038, over 4622557.83 frames. ], batch size: 73, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:53:21,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:21,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:53:28,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=182400.0, ans=0.0 2023-09-28 23:53:29,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:29,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:31,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:31,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:34,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:34,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:35,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:40,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:40,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:53:44,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 23:53:44,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:45,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:46,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=182466.66666666666, ans=0.2 2023-09-28 23:53:47,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:53:47,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:47,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 23:53:47,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:47,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 23:53:47,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=182466.66666666666, ans=0.125 2023-09-28 23:53:49,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:50,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=182466.66666666666, ans=0.5 2023-09-28 23:53:51,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:54,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:54,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:56,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:58,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:02,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:54:04,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:54:04,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:54:07,631 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 23:54:09,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 23:54:09,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:54:09,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:12,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:12,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:12,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=182600.0, ans=0.2 2023-09-28 23:54:18,798 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 23:54:18,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 23:54:21,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:54:23,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:54:25,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=182666.66666666666, ans=0.035 2023-09-28 23:54:27,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:54:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:31,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 23:54:33,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:54:37,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 23:54:42,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:54:43,432 INFO [train.py:1039] (1/4) Epoch 6, batch 850, loss[loss=0.3469, simple_loss=0.3739, pruned_loss=0.1599, over 19128.00 frames. ], tot_loss[loss=0.2433, simple_loss=0.3036, pruned_loss=0.09144, over 4647707.40 frames. ], batch size: 388, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:54:45,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:54:45,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 23:54:45,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:54:48,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:48,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 23:54:48,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:49,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:54:52,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:54:53,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:54:55,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:56,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 23:54:56,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 23:54:58,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 23:54:59,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:55:00,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:02,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:03,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:03,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:55:04,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=182800.0, ans=0.125 2023-09-28 23:55:08,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:08,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:10,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 23:55:11,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 23:55:14,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:16,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 23:55:20,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 23:55:22,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 23:55:24,275 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 23:55:25,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:25,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:55:25,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:55:27,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 23:55:33,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:33,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:35,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:55:36,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:55:37,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.378e+02 2.757e+02 3.805e+02, threshold=4.755e+02, percent-clipped=0.0 2023-09-28 23:55:38,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:55:39,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:55:39,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 23:55:43,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:55:43,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:55:45,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:55:45,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:46,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:48,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:50,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:55:52,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:55:53,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:53,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:56:02,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:56:02,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=183000.0, ans=0.125 2023-09-28 23:56:03,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:56:05,016 INFO [train.py:1039] (1/4) Epoch 6, batch 900, loss[loss=0.1913, simple_loss=0.2612, pruned_loss=0.0607, over 24367.00 frames. ], tot_loss[loss=0.2434, simple_loss=0.3037, pruned_loss=0.09159, over 4666199.23 frames. ], batch size: 56, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:56:05,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 23:56:05,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:05,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:56:05,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=183066.66666666666, ans=0.125 2023-09-28 23:56:06,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 23:56:07,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=183066.66666666666, ans=0.125 2023-09-28 23:56:13,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:56:18,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:20,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 23:56:21,040 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.98 vs. limit=15.0 2023-09-28 23:56:21,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:56:23,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 23:56:23,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:56:24,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:24,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:25,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:56:25,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:56:38,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:56:38,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:38,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:56:42,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:47,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 23:56:47,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:56:52,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:56:52,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:56:52,982 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 23:56:54,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 23:56:56,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=183266.66666666666, ans=0.125 2023-09-28 23:57:00,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:57:00,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:57:02,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:57:02,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=183266.66666666666, ans=0.125 2023-09-28 23:57:09,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:09,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:11,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 23:57:11,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:57:14,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 23:57:15,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:57:15,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:17,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:57:19,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:23,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 23:57:23,238 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 23:57:24,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:57:24,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 23:57:29,116 INFO [train.py:1039] (1/4) Epoch 6, batch 950, loss[loss=0.2357, simple_loss=0.286, pruned_loss=0.09268, over 23535.00 frames. ], tot_loss[loss=0.2432, simple_loss=0.3034, pruned_loss=0.09145, over 4673946.32 frames. ], batch size: 256, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:57:29,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:32,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 23:57:38,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:39,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:40,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:41,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:57:43,142 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 23:57:43,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=183466.66666666666, ans=0.0 2023-09-28 23:57:46,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:47,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=183466.66666666666, ans=0.125 2023-09-28 23:57:47,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=183466.66666666666, ans=0.125 2023-09-28 23:57:48,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:57:50,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:50,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:57:50,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 23:57:50,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:57:53,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:55,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 23:57:55,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:59,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:58:00,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 23:58:02,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:58:05,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:58:06,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:58:11,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:11,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:58:14,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 23:58:18,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 23:58:18,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:58:18,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:20,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:20,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:58:24,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=183600.0, ans=0.125 2023-09-28 23:58:25,031 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.262e+02 2.553e+02 3.079e+02 4.621e+02, threshold=5.106e+02, percent-clipped=0.0 2023-09-28 23:58:25,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 23:58:26,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:58:26,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=183600.0, ans=0.125 2023-09-28 23:58:28,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=183600.0, ans=0.0 2023-09-28 23:58:29,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:30,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:30,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 23:58:32,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:32,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:58:32,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 23:58:33,025 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.88 vs. limit=15.0 2023-09-28 23:58:36,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:58:38,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:43,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:58:44,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 23:58:44,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 23:58:49,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:49,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=183733.33333333334, ans=0.025 2023-09-28 23:58:49,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=183733.33333333334, ans=0.04949747468305833 2023-09-28 23:58:51,489 INFO [train.py:1039] (1/4) Epoch 6, batch 1000, loss[loss=0.2453, simple_loss=0.2976, pruned_loss=0.09648, over 23645.00 frames. ], tot_loss[loss=0.242, simple_loss=0.3022, pruned_loss=0.09088, over 4676527.68 frames. ], batch size: 149, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:58:52,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.64 vs. limit=15.0 2023-09-28 23:58:53,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 23:58:53,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=183733.33333333334, ans=0.125 2023-09-28 23:58:54,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:55,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=183733.33333333334, ans=0.0 2023-09-28 23:58:56,639 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:59:00,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:59:01,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 23:59:01,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 23:59:07,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=183800.0, ans=0.125 2023-09-28 23:59:08,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:08,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:59:08,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:13,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 23:59:16,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 23:59:19,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 23:59:19,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:21,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 23:59:22,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 23:59:22,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 23:59:24,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:24,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=183866.66666666666, ans=0.1 2023-09-28 23:59:26,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:32,382 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:59:33,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=183866.66666666666, ans=0.125 2023-09-28 23:59:35,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:35,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:59:37,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:37,885 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.77 vs. limit=12.0 2023-09-28 23:59:38,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:38,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 23:59:38,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:40,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:59:40,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:41,679 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 23:59:44,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 23:59:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 23:59:48,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.33 vs. limit=15.0 2023-09-28 23:59:48,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 23:59:50,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:59:55,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:56,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:59:56,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:58,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:00:00,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 00:00:02,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=184000.0, ans=0.0 2023-09-29 00:00:03,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:00:04,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 00:00:05,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 00:00:07,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:07,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:00:09,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:00:12,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:00:14,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=184066.66666666666, ans=0.125 2023-09-29 00:00:15,324 INFO [train.py:1039] (1/4) Epoch 6, batch 1050, loss[loss=0.2138, simple_loss=0.2933, pruned_loss=0.06709, over 24460.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3006, pruned_loss=0.09011, over 4687519.49 frames. ], batch size: 69, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:00:15,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:00:19,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:00:20,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:00:22,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:00:24,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:25,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:27,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:00:28,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:00:30,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:00:32,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:00:32,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:00:33,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:00:33,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 00:00:36,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:36,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 00:00:39,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:40,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 00:00:40,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:00:45,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=184133.33333333334, ans=0.125 2023-09-29 00:00:47,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:48,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:00:48,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:49,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=184200.0, ans=0.125 2023-09-29 00:00:50,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 00:00:52,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 00:00:52,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:54,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 00:00:57,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 00:00:58,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:00,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:01:01,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.07 vs. limit=22.5 2023-09-29 00:01:01,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:01:01,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:02,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:01:08,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:01:09,457 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.19 vs. limit=12.0 2023-09-29 00:01:12,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 00:01:14,309 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.070e+02 2.251e+02 2.597e+02 4.023e+02, threshold=4.502e+02, percent-clipped=0.0 2023-09-29 00:01:14,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 00:01:14,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 00:01:16,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:16,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:01:17,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 00:01:22,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:01:24,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:24,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:01:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:24,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 00:01:32,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:32,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 00:01:32,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 00:01:33,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:01:38,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:01:39,987 INFO [train.py:1039] (1/4) Epoch 6, batch 1100, loss[loss=0.2293, simple_loss=0.2994, pruned_loss=0.07957, over 24284.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3002, pruned_loss=0.09025, over 4668105.71 frames. ], batch size: 61, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:01:43,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:01:46,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.69 vs. limit=10.0 2023-09-29 00:01:49,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:01:51,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:01:51,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:01:51,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 00:01:53,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:56,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:01:57,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=184466.66666666666, ans=0.0 2023-09-29 00:01:58,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:02:01,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:02:01,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 00:02:03,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:02:04,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:04,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:02:07,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:02:09,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:02:14,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:02:14,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=184533.33333333334, ans=0.125 2023-09-29 00:02:17,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 00:02:18,026 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 00:02:20,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:02:25,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:02:26,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 00:02:27,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=184533.33333333334, ans=0.0 2023-09-29 00:02:28,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:02:28,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:02:28,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:02:28,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:28,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 00:02:35,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:02:35,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 00:02:35,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=184600.0, ans=0.0 2023-09-29 00:02:36,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.26 vs. limit=15.0 2023-09-29 00:02:36,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:02:41,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:02:44,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 00:02:44,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:02:46,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:48,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:49,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:51,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 00:02:53,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:02:53,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:53,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=184666.66666666666, ans=0.125 2023-09-29 00:02:54,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 00:02:54,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:02:57,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 00:02:58,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:02:58,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:02:58,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=184666.66666666666, ans=0.0 2023-09-29 00:03:00,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:03:04,730 INFO [train.py:1039] (1/4) Epoch 6, batch 1150, loss[loss=0.2525, simple_loss=0.3047, pruned_loss=0.1002, over 23629.00 frames. ], tot_loss[loss=0.2397, simple_loss=0.3003, pruned_loss=0.08962, over 4687998.13 frames. ], batch size: 149, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:03:06,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:07,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=184733.33333333334, ans=0.125 2023-09-29 00:03:07,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=184733.33333333334, ans=0.1 2023-09-29 00:03:10,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:03:11,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:03:11,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 00:03:13,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:14,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 00:03:16,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:16,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:03:21,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 00:03:24,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:29,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:31,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:32,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 00:03:32,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:03:32,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:32,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=184800.0, ans=0.125 2023-09-29 00:03:32,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=184800.0, ans=0.0 2023-09-29 00:03:35,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 00:03:37,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:38,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:48,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 00:03:56,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:03:56,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:03:59,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=184933.33333333334, ans=0.0 2023-09-29 00:04:00,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.083e+02 2.291e+02 2.736e+02 4.000e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-29 00:04:02,583 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 00:04:02,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=184933.33333333334, ans=0.125 2023-09-29 00:04:05,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:15,208 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 00:04:19,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:21,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:04:21,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:04:23,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:04:26,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:27,743 INFO [train.py:1039] (1/4) Epoch 6, batch 1200, loss[loss=0.2301, simple_loss=0.3089, pruned_loss=0.07564, over 24357.00 frames. ], tot_loss[loss=0.2402, simple_loss=0.301, pruned_loss=0.08969, over 4695781.27 frames. ], batch size: 74, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:04:32,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:04:32,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:04:33,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:33,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:33,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:04:35,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:04:36,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:04:40,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:40,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:41,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=185133.33333333334, ans=0.1 2023-09-29 00:04:43,775 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 00:04:47,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 00:04:48,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=185133.33333333334, ans=0.1 2023-09-29 00:04:51,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:04:54,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:04:58,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:59,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:04:59,575 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 00:05:01,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:07,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:05:07,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:05:07,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 00:05:07,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:05:11,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 00:05:12,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.90 vs. limit=6.0 2023-09-29 00:05:15,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=185266.66666666666, ans=0.125 2023-09-29 00:05:16,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 00:05:16,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:18,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:05:20,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:20,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:05:20,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=185266.66666666666, ans=0.1 2023-09-29 00:05:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:05:22,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:05:24,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:05:24,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 00:05:24,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:05:25,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:25,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:05:29,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:29,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:29,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=185266.66666666666, ans=0.0 2023-09-29 00:05:32,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=185333.33333333334, ans=0.125 2023-09-29 00:05:33,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:05:35,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:05:39,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 00:05:40,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-09-29 00:05:43,004 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 00:05:44,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:05:45,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:46,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.00 vs. limit=22.5 2023-09-29 00:05:47,248 INFO [train.py:1039] (1/4) Epoch 6, batch 1250, loss[loss=0.2351, simple_loss=0.3091, pruned_loss=0.08055, over 24667.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3015, pruned_loss=0.08967, over 4715909.91 frames. ], batch size: 73, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:05:48,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:05:51,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:54,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 00:05:57,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:05:58,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:05:59,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 00:05:59,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=185400.0, ans=0.125 2023-09-29 00:06:00,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:06:02,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:06:05,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:06:08,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:08,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:06:08,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:11,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:06:15,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:06:15,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:06:15,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:06:17,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:17,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:20,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:22,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:06:25,305 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=12.67 vs. limit=15.0 2023-09-29 00:06:29,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 00:06:30,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:06:35,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:35,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 00:06:35,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:36,608 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 00:06:36,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:36,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:39,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.172e+02 2.410e+02 2.804e+02 3.996e+02, threshold=4.819e+02, percent-clipped=0.0 2023-09-29 00:06:42,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:06:43,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 00:06:43,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 00:06:43,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 00:06:44,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=185600.0, ans=0.1 2023-09-29 00:06:48,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:06:49,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 00:06:49,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:52,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:06:52,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:06:55,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 00:06:56,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:06:56,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:06:56,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:06:56,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=185666.66666666666, ans=0.125 2023-09-29 00:06:57,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:57,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 00:07:03,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:03,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:07:04,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:07:06,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:07:07,766 INFO [train.py:1039] (1/4) Epoch 6, batch 1300, loss[loss=0.1934, simple_loss=0.2663, pruned_loss=0.06023, over 24332.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3013, pruned_loss=0.08945, over 4712568.05 frames. ], batch size: 61, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:07:11,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:11,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 00:07:14,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:15,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:07:17,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:18,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:07:20,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:07:21,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 00:07:25,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:07:27,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:07:29,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 00:07:33,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:07:37,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:37,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:07:39,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:39,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:40,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:07:40,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:07:42,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 00:07:45,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=185866.66666666666, ans=0.125 2023-09-29 00:07:48,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:07:48,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:07:49,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 00:07:50,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:07:52,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:07:53,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=185866.66666666666, ans=0.125 2023-09-29 00:07:55,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:57,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 00:07:58,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:07:58,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 00:07:58,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:02,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:02,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:08:04,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 00:08:05,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 00:08:07,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 00:08:11,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:08:14,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 00:08:17,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:23,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten.whitening_limit, batch_count=186000.0, ans=15.0 2023-09-29 00:08:24,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 00:08:27,627 INFO [train.py:1039] (1/4) Epoch 6, batch 1350, loss[loss=0.2411, simple_loss=0.2888, pruned_loss=0.09673, over 23802.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.2995, pruned_loss=0.08932, over 4703580.14 frames. ], batch size: 179, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:08:27,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:29,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:32,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:33,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:36,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:08:36,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:39,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:41,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 00:08:43,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:08:44,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:08:47,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 00:08:47,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=186133.33333333334, ans=0.0 2023-09-29 00:08:49,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:50,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:50,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 00:08:53,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 00:08:55,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 00:08:57,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=186133.33333333334, ans=0.2 2023-09-29 00:08:58,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:58,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 00:09:01,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=186200.0, ans=0.0 2023-09-29 00:09:08,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=186200.0, ans=0.0 2023-09-29 00:09:11,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:17,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=186266.66666666666, ans=0.125 2023-09-29 00:09:21,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:22,609 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.119e+02 2.458e+02 2.800e+02 4.358e+02, threshold=4.916e+02, percent-clipped=0.0 2023-09-29 00:09:22,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:22,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 00:09:23,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=186266.66666666666, ans=0.125 2023-09-29 00:09:24,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=186266.66666666666, ans=0.1 2023-09-29 00:09:25,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:27,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 00:09:27,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:09:28,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:09:31,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:09:33,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 00:09:33,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=186333.33333333334, ans=0.125 2023-09-29 00:09:34,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:09:39,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 00:09:43,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 00:09:45,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-09-29 00:09:48,756 INFO [train.py:1039] (1/4) Epoch 6, batch 1400, loss[loss=0.2207, simple_loss=0.2936, pruned_loss=0.07387, over 24514.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.2981, pruned_loss=0.08837, over 4703947.97 frames. ], batch size: 66, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:09:48,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 00:09:50,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:53,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:09:55,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:09:58,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 00:10:00,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 00:10:02,341 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.82 vs. limit=12.0 2023-09-29 00:10:10,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:10:12,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:16,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:10:17,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:10:21,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:10:22,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 00:10:26,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.67 vs. limit=22.5 2023-09-29 00:10:30,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:30,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:34,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.63 vs. limit=10.0 2023-09-29 00:10:35,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 00:10:36,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:10:36,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:10:39,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:10:39,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:41,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:10:42,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:10:42,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:10:44,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 00:10:45,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:10:49,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:56,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:11:03,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 00:11:05,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:11:06,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:11:09,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 00:11:10,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:11,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:11:12,496 INFO [train.py:1039] (1/4) Epoch 6, batch 1450, loss[loss=0.2596, simple_loss=0.3183, pruned_loss=0.1004, over 23267.00 frames. ], tot_loss[loss=0.236, simple_loss=0.2965, pruned_loss=0.08772, over 4696100.46 frames. ], batch size: 93, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:11:15,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:11:15,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:11:17,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:17,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:11:19,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=186733.33333333334, ans=0.125 2023-09-29 00:11:22,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:23,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:11:25,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:11:25,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 00:11:27,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:11:27,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 00:11:28,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:30,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:30,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 00:11:30,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=186800.0, ans=0.125 2023-09-29 00:11:32,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:11:32,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:11:34,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 00:11:34,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:11:37,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:38,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=186800.0, ans=0.0 2023-09-29 00:11:38,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=186800.0, ans=0.0 2023-09-29 00:11:39,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:39,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=186800.0, ans=0.125 2023-09-29 00:11:43,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:11:43,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:11:45,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:46,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:48,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:48,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:11:48,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:49,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:11:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 00:11:56,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:11:57,807 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.91 vs. limit=10.0 2023-09-29 00:12:00,670 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 00:12:02,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:04,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:12:06,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:08,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 00:12:09,774 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.135e+02 2.497e+02 3.099e+02 5.077e+02, threshold=4.994e+02, percent-clipped=1.0 2023-09-29 00:12:11,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=186933.33333333334, ans=0.125 2023-09-29 00:12:12,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:14,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 00:12:14,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 00:12:15,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:18,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:20,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:20,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 00:12:23,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 00:12:23,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 00:12:25,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:26,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:12:32,989 INFO [train.py:1039] (1/4) Epoch 6, batch 1500, loss[loss=0.2463, simple_loss=0.3228, pruned_loss=0.08492, over 24455.00 frames. ], tot_loss[loss=0.2369, simple_loss=0.2975, pruned_loss=0.08819, over 4695268.33 frames. ], batch size: 69, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:12:38,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 00:12:38,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:12:38,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:12:40,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:41,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:42,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:12:43,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 00:12:43,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=187066.66666666666, ans=0.125 2023-09-29 00:12:45,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:12:45,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:12:45,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:46,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:47,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.35 vs. limit=12.0 2023-09-29 00:12:48,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:12:49,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:54,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 00:12:54,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:12:55,049 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.25 vs. limit=15.0 2023-09-29 00:12:55,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:12:55,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:56,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=187133.33333333334, ans=0.0 2023-09-29 00:12:58,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 00:13:03,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 00:13:03,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=187200.0, ans=0.125 2023-09-29 00:13:05,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=187200.0, ans=0.125 2023-09-29 00:13:07,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:13:07,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 00:13:11,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:13:13,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:14,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:13:14,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:13:15,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.43 vs. limit=15.0 2023-09-29 00:13:16,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 00:13:16,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:13:16,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:17,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 00:13:17,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:18,680 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.13 vs. limit=15.0 2023-09-29 00:13:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:13:25,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 00:13:28,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:13:31,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:13:35,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 00:13:35,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:37,108 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 00:13:37,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:13:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:13:40,787 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 00:13:42,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:13:46,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 00:13:47,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:52,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:53,963 INFO [train.py:1039] (1/4) Epoch 6, batch 1550, loss[loss=0.2364, simple_loss=0.2888, pruned_loss=0.09205, over 23808.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.2979, pruned_loss=0.08787, over 4702198.32 frames. ], batch size: 179, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:13:54,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:54,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:55,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 00:13:55,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 00:13:55,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:13:57,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 00:13:57,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 00:13:59,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=187400.0, ans=0.2 2023-09-29 00:14:00,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:01,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.56 vs. limit=15.0 2023-09-29 00:14:01,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:03,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:03,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:14:03,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:04,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:07,750 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 00:14:07,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:07,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:14:09,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:14:10,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:14:10,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 00:14:12,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:13,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 00:14:14,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 00:14:14,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 00:14:16,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:17,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:21,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:14:24,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 00:14:24,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 00:14:33,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:36,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:36,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:14:36,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:14:36,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 00:14:41,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:14:42,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:42,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=187600.0, ans=0.0 2023-09-29 00:14:45,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:14:49,280 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.090e+02 2.378e+02 2.713e+02 3.704e+02, threshold=4.756e+02, percent-clipped=0.0 2023-09-29 00:14:49,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:14:49,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:49,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 00:14:49,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:14:51,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:14:51,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:53,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:14:53,749 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 00:14:55,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=187600.0, ans=0.1 2023-09-29 00:14:56,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:01,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 00:15:07,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:08,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:09,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 00:15:10,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:15:12,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:12,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:15:12,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:15:13,493 INFO [train.py:1039] (1/4) Epoch 6, batch 1600, loss[loss=0.2907, simple_loss=0.3299, pruned_loss=0.1258, over 22731.00 frames. ], tot_loss[loss=0.238, simple_loss=0.299, pruned_loss=0.08852, over 4709504.64 frames. ], batch size: 323, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:15:13,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:15:16,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:17,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 00:15:18,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 00:15:22,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 00:15:25,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:15:26,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 00:15:28,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:15:30,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:15:36,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:15:38,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=187800.0, ans=0.0 2023-09-29 00:15:40,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 00:15:42,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:15:43,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 00:15:44,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=187866.66666666666, ans=0.125 2023-09-29 00:15:45,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:45,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 00:15:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 00:15:50,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=187866.66666666666, ans=0.025 2023-09-29 00:15:55,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=187866.66666666666, ans=0.125 2023-09-29 00:15:58,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:59,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 00:15:59,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:16:01,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:01,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:16:05,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 00:16:06,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.70 vs. limit=12.0 2023-09-29 00:16:09,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:16:10,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:16:11,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:16:14,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:16:17,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:16:18,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:16:23,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:25,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:16:27,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 00:16:28,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:16:28,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 00:16:30,115 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.79 vs. limit=22.5 2023-09-29 00:16:33,095 INFO [train.py:1039] (1/4) Epoch 6, batch 1650, loss[loss=0.2391, simple_loss=0.313, pruned_loss=0.08259, over 23736.00 frames. ], tot_loss[loss=0.2378, simple_loss=0.2993, pruned_loss=0.08819, over 4716024.25 frames. ], batch size: 85, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:16:36,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:37,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:16:37,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:16:39,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 00:16:39,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 00:16:39,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 00:16:39,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 00:16:42,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:43,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:44,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:16:44,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:16:46,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=188066.66666666666, ans=0.125 2023-09-29 00:16:47,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:48,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 00:16:51,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:51,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:51,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:16:51,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:16:53,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 00:16:53,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 00:16:58,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:16:59,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:17:08,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 00:17:10,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:12,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 00:17:16,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:19,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:17:19,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:17:20,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:21,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:17:21,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:25,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:25,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:27,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:27,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:28,446 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.178e+02 2.493e+02 2.802e+02 6.343e+02, threshold=4.987e+02, percent-clipped=2.0 2023-09-29 00:17:28,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:28,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:17:31,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:33,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 00:17:34,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:34,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 00:17:37,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 00:17:37,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 00:17:39,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:39,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:17:40,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:40,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:40,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 00:17:45,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:46,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:17:46,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:50,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 00:17:53,514 INFO [train.py:1039] (1/4) Epoch 6, batch 1700, loss[loss=0.2593, simple_loss=0.3268, pruned_loss=0.09588, over 24352.00 frames. ], tot_loss[loss=0.2379, simple_loss=0.2996, pruned_loss=0.08816, over 4724222.62 frames. ], batch size: 77, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:17:55,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:55,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:17:55,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 00:17:55,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:17:56,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:17:56,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:59,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:17:59,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:17:59,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 00:18:01,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=188400.0, ans=0.04949747468305833 2023-09-29 00:18:02,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:18:03,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.66 vs. limit=15.0 2023-09-29 00:18:09,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:18:13,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:18:19,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:18:21,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:21,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:18:21,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:24,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 00:18:24,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:18:25,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:27,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:18:27,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:18:29,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 00:18:30,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 00:18:32,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:33,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 00:18:34,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=188533.33333333334, ans=0.125 2023-09-29 00:18:35,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:18:40,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=188600.0, ans=0.0 2023-09-29 00:18:44,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:18:46,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:47,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:49,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:18:49,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 00:18:49,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:51,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:51,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 00:18:53,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:18:53,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:53,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:53,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:18:53,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=188600.0, ans=0.1 2023-09-29 00:18:56,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:56,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:18:57,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:57,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:18:57,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:19:02,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:05,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 00:19:05,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:19:07,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:08,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 00:19:14,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=188733.33333333334, ans=0.1 2023-09-29 00:19:16,070 INFO [train.py:1039] (1/4) Epoch 6, batch 1750, loss[loss=0.2433, simple_loss=0.3006, pruned_loss=0.09298, over 23728.00 frames. ], tot_loss[loss=0.237, simple_loss=0.2991, pruned_loss=0.08745, over 4723377.01 frames. ], batch size: 179, lr: 1.72e-02, grad_scale: 32.0 2023-09-29 00:19:17,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:21,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:21,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:19:21,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=188733.33333333334, ans=0.2 2023-09-29 00:19:22,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 00:19:22,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:19:24,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=188733.33333333334, ans=0.0 2023-09-29 00:19:26,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:19:26,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:28,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.29 vs. limit=6.0 2023-09-29 00:19:29,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 00:19:32,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:32,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=188800.0, ans=0.125 2023-09-29 00:19:34,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=188800.0, ans=0.0 2023-09-29 00:19:35,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 00:19:35,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:19:37,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:19:38,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:19:40,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 00:19:43,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:19:43,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 00:19:46,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=188866.66666666666, ans=10.0 2023-09-29 00:19:53,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:19:57,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:19:57,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:00,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:00,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:03,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:04,191 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:20:05,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:07,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:07,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:20:08,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 00:20:10,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:12,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 00:20:13,403 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.157e+02 2.511e+02 2.908e+02 4.872e+02, threshold=5.023e+02, percent-clipped=0.0 2023-09-29 00:20:13,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:16,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:18,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:20:21,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:20:23,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 00:20:23,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:25,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:30,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:34,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:20:35,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:20:37,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 00:20:37,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:38,617 INFO [train.py:1039] (1/4) Epoch 6, batch 1800, loss[loss=0.2419, simple_loss=0.3001, pruned_loss=0.09182, over 23229.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.2986, pruned_loss=0.08751, over 4731455.03 frames. ], batch size: 105, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:20:38,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:20:38,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:38,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:20:38,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:20:40,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:20:42,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:20:43,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:45,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:20:48,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:51,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:20:52,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:55,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:20:57,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:59,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:59,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:21:03,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:21:03,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 00:21:04,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:08,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:12,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 00:21:15,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 00:21:15,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 00:21:15,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:17,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:21:17,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:21:19,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:21:19,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=189200.0, ans=0.125 2023-09-29 00:21:21,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=189200.0, ans=0.0 2023-09-29 00:21:23,892 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 00:21:25,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:21:25,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=189266.66666666666, ans=0.125 2023-09-29 00:21:28,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:29,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 00:21:31,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 00:21:32,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:21:33,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:21:35,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:21:39,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 00:21:48,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:21:48,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 00:21:48,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:21:48,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:49,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:21:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 00:21:50,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.61 vs. limit=12.0 2023-09-29 00:21:53,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:21:53,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:21:56,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 00:21:56,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:59,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:21:59,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:21:59,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:59,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:22:00,797 INFO [train.py:1039] (1/4) Epoch 6, batch 1850, loss[loss=0.2412, simple_loss=0.3116, pruned_loss=0.08536, over 24530.00 frames. ], tot_loss[loss=0.237, simple_loss=0.2993, pruned_loss=0.08736, over 4728927.17 frames. ], batch size: 71, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:22:00,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:22:02,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:22:02,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:04,839 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.07 vs. limit=15.0 2023-09-29 00:22:06,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:22:06,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:14,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:22:16,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 00:22:17,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.49 vs. limit=15.0 2023-09-29 00:22:18,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 00:22:22,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 00:22:25,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:22:27,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 00:22:27,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 00:22:29,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=189466.66666666666, ans=0.0 2023-09-29 00:22:36,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:22:40,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 00:22:41,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:22:43,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:22:48,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 00:22:48,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=189600.0, ans=0.125 2023-09-29 00:22:49,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:49,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:22:50,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:22:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:56,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:59,537 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.153e+02 2.382e+02 2.790e+02 3.964e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 00:22:59,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:22:59,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:59,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:23:01,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:02,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:02,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:23:06,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 00:23:07,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:10,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:23:10,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:23:10,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 00:23:10,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 00:23:11,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.30 vs. limit=15.0 2023-09-29 00:23:14,195 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 00:23:14,340 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 00:23:17,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:23:17,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:23:17,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:17,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,386 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 00:23:19,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:23:19,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:23:21,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:23:22,537 INFO [train.py:1039] (1/4) Epoch 6, batch 1900, loss[loss=0.2404, simple_loss=0.2982, pruned_loss=0.09128, over 23194.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.2995, pruned_loss=0.08746, over 4729882.53 frames. ], batch size: 105, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:23:22,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:23:22,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 00:23:26,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:26,219 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 00:23:26,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:23:28,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:31,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=189733.33333333334, ans=0.2 2023-09-29 00:23:32,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:35,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:23:37,345 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 00:23:37,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 00:23:37,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=189800.0, ans=0.125 2023-09-29 00:23:39,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:40,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:23:40,606 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 00:23:40,654 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 00:23:45,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 00:23:47,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:23:50,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 00:23:52,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=189800.0, ans=0.125 2023-09-29 00:23:53,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 00:23:53,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=189866.66666666666, ans=0.0 2023-09-29 00:24:04,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 00:24:07,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 00:24:07,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:07,234 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 00:24:07,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 00:24:07,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 00:24:08,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 00:24:08,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:24:13,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 00:24:17,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:24:20,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:20,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 00:24:24,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:24:27,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 00:24:27,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:33,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:24:33,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:24:33,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:24:34,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:24:36,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:24:37,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:24:37,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:24:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:39,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:24:42,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:24:42,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:42,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:45,068 INFO [train.py:1039] (1/4) Epoch 6, batch 1950, loss[loss=0.2512, simple_loss=0.2946, pruned_loss=0.1039, over 23774.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3011, pruned_loss=0.08922, over 4717638.07 frames. ], batch size: 164, lr: 1.72e-02, grad_scale: 8.0 2023-09-29 00:24:45,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:49,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:24:51,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:24:51,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:51,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:24:52,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 00:24:54,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:24:54,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:56,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:58,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:24:59,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:24:59,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:01,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:06,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:25:08,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:25:08,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:25:08,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:11,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:14,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:25:14,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:14,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:25:14,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 00:25:14,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:25:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:25:16,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:20,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:22,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:25:26,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:25:30,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:25:30,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:25:32,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 00:25:32,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:25:36,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:40,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:25:40,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:25:41,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=190266.66666666666, ans=0.0 2023-09-29 00:25:46,172 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.214e+02 2.605e+02 2.904e+02 4.592e+02, threshold=5.209e+02, percent-clipped=0.0 2023-09-29 00:25:49,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:51,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:51,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=190333.33333333334, ans=0.125 2023-09-29 00:25:53,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:56,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:59,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:26:00,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:26:01,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 00:26:01,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:26:01,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:03,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 00:26:06,581 INFO [train.py:1039] (1/4) Epoch 6, batch 2000, loss[loss=0.2332, simple_loss=0.3077, pruned_loss=0.07933, over 24373.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3017, pruned_loss=0.08897, over 4725743.55 frames. ], batch size: 77, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:26:06,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:09,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:26:09,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:26:11,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:13,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:26:15,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:18,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 00:26:18,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:26:23,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:26:25,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 00:26:26,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:26:26,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:27,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-09-29 00:26:29,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:26:30,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 00:26:32,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 00:26:36,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:26:36,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=190466.66666666666, ans=0.125 2023-09-29 00:26:38,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 00:26:38,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:41,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:26:41,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:26:41,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:42,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:26:44,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:26:45,383 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.27 vs. limit=15.0 2023-09-29 00:26:45,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 00:26:49,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 00:26:49,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:50,163 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=22.5 2023-09-29 00:26:50,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:55,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:55,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.28 vs. limit=15.0 2023-09-29 00:26:56,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:26:57,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:26:58,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:59,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:00,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:01,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:27:01,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:03,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:06,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:27:08,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 00:27:10,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=190666.66666666666, ans=0.125 2023-09-29 00:27:13,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=190666.66666666666, ans=0.125 2023-09-29 00:27:15,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:27:15,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:15,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=190666.66666666666, ans=0.125 2023-09-29 00:27:18,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:18,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:27:21,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:21,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=190666.66666666666, ans=0.125 2023-09-29 00:27:23,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:23,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:25,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:27:25,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:27:29,344 INFO [train.py:1039] (1/4) Epoch 6, batch 2050, loss[loss=0.2135, simple_loss=0.2715, pruned_loss=0.07778, over 24454.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.2999, pruned_loss=0.08811, over 4731325.85 frames. ], batch size: 58, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:27:29,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:30,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:32,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:32,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:38,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:40,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:27:40,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:41,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:27:45,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 00:27:45,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:27:47,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:47,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:27:57,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:27:57,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:57,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=190800.0, ans=0.125 2023-09-29 00:28:00,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 00:28:03,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:28:05,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 00:28:05,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:28:07,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:07,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=190866.66666666666, ans=0.125 2023-09-29 00:28:10,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:11,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:28:12,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:14,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:28:16,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:28:16,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:28:19,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:21,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:28:23,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.88 vs. limit=15.0 2023-09-29 00:28:24,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:28:25,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:28:29,259 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.164e+02 2.484e+02 2.839e+02 4.579e+02, threshold=4.968e+02, percent-clipped=0.0 2023-09-29 00:28:29,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:35,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:28:35,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 00:28:41,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:42,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:28:44,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:28:47,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 00:28:50,061 INFO [train.py:1039] (1/4) Epoch 6, batch 2100, loss[loss=0.2374, simple_loss=0.2935, pruned_loss=0.09069, over 23293.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.298, pruned_loss=0.08735, over 4713699.57 frames. ], batch size: 105, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:28:50,323 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 00:28:50,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:28:50,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:51,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:28:53,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:53,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 00:28:53,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 00:28:56,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:59,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:28:59,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:29:03,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:05,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:29:05,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 00:29:06,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:29:06,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=191133.33333333334, ans=0.125 2023-09-29 00:29:07,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 00:29:07,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 00:29:08,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:08,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:08,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 00:29:10,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 00:29:11,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.90 vs. limit=6.0 2023-09-29 00:29:15,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 00:29:15,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:29:19,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:29:20,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.56 vs. limit=15.0 2023-09-29 00:29:21,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:29:22,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:29:24,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 00:29:24,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:24,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:29:27,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 00:29:29,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:29,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 00:29:29,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 00:29:29,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 00:29:32,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:29:34,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:29:37,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:37,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:38,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=191266.66666666666, ans=0.035 2023-09-29 00:29:40,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:42,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:42,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 00:29:42,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:42,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:43,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:43,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 00:29:45,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 00:29:45,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 00:29:48,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:29:52,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:52,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 00:29:53,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.92 vs. limit=10.0 2023-09-29 00:29:54,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=191333.33333333334, ans=0.07 2023-09-29 00:29:59,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:02,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:30:02,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:02,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:02,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:30:04,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:04,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=191333.33333333334, ans=0.125 2023-09-29 00:30:06,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:06,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:30:07,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:30:07,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:09,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 00:30:09,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=191333.33333333334, ans=0.125 2023-09-29 00:30:10,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 00:30:10,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:12,257 INFO [train.py:1039] (1/4) Epoch 6, batch 2150, loss[loss=0.2182, simple_loss=0.2795, pruned_loss=0.07846, over 24302.00 frames. ], tot_loss[loss=0.2352, simple_loss=0.2971, pruned_loss=0.08663, over 4710320.96 frames. ], batch size: 56, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:30:14,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:30:14,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:30:14,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:30:14,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:30:21,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:30:24,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:24,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:25,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:30:25,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:25,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:30:29,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:30,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:30:30,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:30:34,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:34,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 00:30:35,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=191466.66666666666, ans=0.125 2023-09-29 00:30:39,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:40,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:30:41,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:42,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:30:44,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:44,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:45,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:47,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 00:30:48,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:30:49,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:49,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:51,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:52,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:30:54,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:55,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:30:57,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:57,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 00:30:57,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:30:57,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=191533.33333333334, ans=0.0 2023-09-29 00:31:00,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:00,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:02,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:02,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:31:03,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:05,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:05,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 00:31:07,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 00:31:07,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:31:08,651 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 00:31:08,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:10,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:31:12,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 00:31:12,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:31:12,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 00:31:12,169 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 00:31:12,169 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 00:31:13,510 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.175e+02 2.371e+02 2.778e+02 4.132e+02, threshold=4.742e+02, percent-clipped=0.0 2023-09-29 00:31:13,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 00:31:15,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:15,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:31:16,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:31:16,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:18,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:31:19,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:19,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:30,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:31:31,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 00:31:34,640 INFO [train.py:1039] (1/4) Epoch 6, batch 2200, loss[loss=0.1861, simple_loss=0.2614, pruned_loss=0.05539, over 24344.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.2975, pruned_loss=0.08683, over 4722222.73 frames. ], batch size: 56, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:31:34,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:31:39,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:40,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:31:40,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:31:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:31:46,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:47,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:31:47,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 00:31:54,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 00:31:55,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:32:01,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 00:32:05,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:07,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:07,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:32:11,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:32:11,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 00:32:16,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:32:16,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:18,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 00:32:21,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:32:23,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:25,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:32:25,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=191933.33333333334, ans=0.1 2023-09-29 00:32:26,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:28,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 00:32:29,112 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-09-29 00:32:29,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:30,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 00:32:32,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:32,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:32:32,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:36,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:37,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:37,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:37,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:37,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=191933.33333333334, ans=0.2 2023-09-29 00:32:38,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=191933.33333333334, ans=0.2 2023-09-29 00:32:39,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:32:40,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:32:42,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:32:45,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:32:45,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:32:47,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:32:48,719 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 00:32:50,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:32:51,736 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 00:32:51,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:32:51,963 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 00:32:53,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:55,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:32:56,881 INFO [train.py:1039] (1/4) Epoch 6, batch 2250, loss[loss=0.2745, simple_loss=0.3191, pruned_loss=0.115, over 23482.00 frames. ], tot_loss[loss=0.2373, simple_loss=0.2991, pruned_loss=0.08777, over 4727493.67 frames. ], batch size: 285, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:32:58,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 00:33:00,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:33:04,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:11,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:33:11,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:33:14,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:15,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:17,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:17,752 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:33:20,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 00:33:20,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:20,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:33:22,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=192133.33333333334, ans=0.125 2023-09-29 00:33:23,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 00:33:23,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:33:24,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:26,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:27,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=192200.0, ans=0.2 2023-09-29 00:33:31,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:33,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:33:33,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:33:35,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 00:33:36,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:40,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:33:44,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:45,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:47,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:33:47,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:47,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=192266.66666666666, ans=0.125 2023-09-29 00:33:48,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:50,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:33:54,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:33:56,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:33:57,684 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.089e+02 2.370e+02 2.766e+02 4.098e+02, threshold=4.740e+02, percent-clipped=0.0 2023-09-29 00:34:02,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:34:04,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:34:04,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:34:08,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=192333.33333333334, ans=0.0 2023-09-29 00:34:09,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:34:13,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:34:13,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 00:34:14,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:14,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:34:18,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 00:34:19,573 INFO [train.py:1039] (1/4) Epoch 6, batch 2300, loss[loss=0.2332, simple_loss=0.3051, pruned_loss=0.08067, over 24310.00 frames. ], tot_loss[loss=0.2373, simple_loss=0.2991, pruned_loss=0.08773, over 4732851.01 frames. ], batch size: 74, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:34:19,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:34:19,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:23,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.75 vs. limit=15.0 2023-09-29 00:34:23,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=14.68 vs. limit=15.0 2023-09-29 00:34:25,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:25,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:34:27,431 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 00:34:30,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:38,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:34:38,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:34:38,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:34:40,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:40,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 00:34:40,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=15.0 2023-09-29 00:34:41,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:34:46,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:34:46,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:34:50,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:34:54,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:34:57,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:01,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:35:03,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:35:06,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:35:07,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:35:11,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:35:11,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:35:13,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:35:13,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 00:35:16,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:35:16,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:16,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:16,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:35:18,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:19,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:35:19,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:35:19,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 00:35:21,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:35:21,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:22,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 00:35:30,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:35:31,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:35:33,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.66 vs. limit=10.0 2023-09-29 00:35:37,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:37,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:35:39,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:35:40,481 INFO [train.py:1039] (1/4) Epoch 6, batch 2350, loss[loss=0.2314, simple_loss=0.3003, pruned_loss=0.08121, over 24103.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.3002, pruned_loss=0.08796, over 4735259.46 frames. ], batch size: 80, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:35:40,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:35:40,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:35:42,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:35:44,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 00:35:47,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=192733.33333333334, ans=0.125 2023-09-29 00:35:50,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:35:50,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 00:35:57,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 00:35:59,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:36:01,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:03,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:03,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 00:36:03,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=192800.0, ans=0.125 2023-09-29 00:36:07,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:36:12,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 00:36:13,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:16,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:36:16,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:36:19,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:36:22,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 00:36:22,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:36:23,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:23,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:25,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:36:30,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:36:32,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 00:36:33,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:36:36,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:37,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:36:39,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 00:36:39,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:36:39,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=192933.33333333334, ans=0.0 2023-09-29 00:36:41,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 00:36:41,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:36:42,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 2.125e+02 2.359e+02 2.805e+02 3.859e+02, threshold=4.718e+02, percent-clipped=0.0 2023-09-29 00:36:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 00:36:49,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 00:36:50,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:50,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:36:50,836 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 00:36:52,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 00:36:52,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 00:36:55,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=193000.0, ans=0.125 2023-09-29 00:36:56,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:37:01,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:37:03,294 INFO [train.py:1039] (1/4) Epoch 6, batch 2400, loss[loss=0.242, simple_loss=0.314, pruned_loss=0.08497, over 23767.00 frames. ], tot_loss[loss=0.2387, simple_loss=0.3006, pruned_loss=0.08842, over 4735421.41 frames. ], batch size: 85, lr: 1.71e-02, grad_scale: 32.0 2023-09-29 00:37:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:37:08,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:37:10,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 00:37:10,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 00:37:16,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:37:16,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:37:20,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 00:37:22,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:37:23,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:23,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 00:37:29,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=193133.33333333334, ans=0.0 2023-09-29 00:37:30,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:32,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 00:37:37,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:37:40,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 00:37:43,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:37:45,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:49,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:37:50,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 00:37:51,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:37:59,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:37:59,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.49 vs. limit=15.0 2023-09-29 00:38:02,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:03,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=193266.66666666666, ans=0.125 2023-09-29 00:38:05,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:05,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:38:05,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:38:05,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:38:05,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:06,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:07,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:38:12,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:12,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:38:12,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 00:38:14,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 00:38:15,376 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=15.0 2023-09-29 00:38:18,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:38:18,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:18,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 00:38:19,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 00:38:19,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 00:38:19,777 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 00:38:21,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 00:38:21,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:38:23,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:24,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:25,948 INFO [train.py:1039] (1/4) Epoch 6, batch 2450, loss[loss=0.2044, simple_loss=0.2405, pruned_loss=0.08414, over 19111.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.2989, pruned_loss=0.08763, over 4716700.27 frames. ], batch size: 388, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:38:25,999 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 00:38:26,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:28,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:38:29,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=193400.0, ans=0.1 2023-09-29 00:38:32,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:38:32,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:35,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:37,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:37,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 00:38:43,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:43,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:48,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:38:48,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:38:48,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:38:49,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 00:38:53,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:55,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:38:56,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:59,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:38:59,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:01,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:01,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=193533.33333333334, ans=0.125 2023-09-29 00:39:01,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=193533.33333333334, ans=0.1 2023-09-29 00:39:03,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:39:04,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 00:39:04,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:39:14,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:15,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:39:16,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:16,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:39:18,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:18,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:39:18,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 00:39:18,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=193600.0, ans=0.1 2023-09-29 00:39:22,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.59 vs. limit=15.0 2023-09-29 00:39:23,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:23,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:39:23,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=193600.0, ans=0.125 2023-09-29 00:39:26,707 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.234e+02 2.563e+02 3.066e+02 5.570e+02, threshold=5.125e+02, percent-clipped=5.0 2023-09-29 00:39:26,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:39:26,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:31,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:39:32,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 00:39:34,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:39:34,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:39:34,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 00:39:34,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:39:36,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:39:39,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:39:41,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=193666.66666666666, ans=0.125 2023-09-29 00:39:42,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:42,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:39:45,688 INFO [train.py:1039] (1/4) Epoch 6, batch 2500, loss[loss=0.2309, simple_loss=0.304, pruned_loss=0.07886, over 24376.00 frames. ], tot_loss[loss=0.2357, simple_loss=0.2971, pruned_loss=0.08713, over 4704572.20 frames. ], batch size: 77, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:39:46,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 00:39:47,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:39:49,843 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.43 vs. limit=15.0 2023-09-29 00:39:54,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:40:04,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:40:05,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:40:07,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:40:07,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 00:40:12,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:40:14,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:15,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:40:15,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 00:40:15,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 00:40:17,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:18,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:18,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 00:40:20,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:20,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 00:40:21,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:25,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:40:26,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:30,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:40:30,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 00:40:32,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:40:33,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:37,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:37,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=193933.33333333334, ans=0.2 2023-09-29 00:40:40,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=193933.33333333334, ans=0.05 2023-09-29 00:40:42,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:45,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:40:49,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:40:49,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=193933.33333333334, ans=0.125 2023-09-29 00:40:52,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 00:40:52,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:52,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:40:55,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:40:55,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:40:55,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.38 vs. limit=15.0 2023-09-29 00:40:56,684 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 00:40:56,684 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 00:40:56,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 00:40:58,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.69 vs. limit=10.0 2023-09-29 00:41:00,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:03,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 00:41:03,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 00:41:04,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:41:04,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 00:41:05,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=194000.0, ans=0.1 2023-09-29 00:41:07,742 INFO [train.py:1039] (1/4) Epoch 6, batch 2550, loss[loss=0.2168, simple_loss=0.2832, pruned_loss=0.07523, over 24619.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.2972, pruned_loss=0.08789, over 4694322.48 frames. ], batch size: 60, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:41:09,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 00:41:11,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:13,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:41:14,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:41:16,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:16,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 00:41:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:41:18,696 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:41:21,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 00:41:22,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:41:25,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:27,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:41:27,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 00:41:27,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:41:27,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:28,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.05 vs. limit=22.5 2023-09-29 00:41:29,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:31,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:41:31,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 00:41:32,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:41:32,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:32,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 00:41:39,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=194200.0, ans=0.125 2023-09-29 00:41:43,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=194200.0, ans=0.125 2023-09-29 00:41:44,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:41:47,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=194200.0, ans=15.0 2023-09-29 00:41:50,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=194200.0, ans=0.125 2023-09-29 00:41:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:41:51,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:51,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:53,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:42:00,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:42:02,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:42:02,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:42:02,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:42:03,364 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.16 vs. limit=22.5 2023-09-29 00:42:04,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:42:04,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:42:06,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=194266.66666666666, ans=0.0 2023-09-29 00:42:09,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:09,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:10,627 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.103e+02 2.352e+02 2.955e+02 4.902e+02, threshold=4.704e+02, percent-clipped=0.0 2023-09-29 00:42:14,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:42:14,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 00:42:14,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:42:14,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=194333.33333333334, ans=0.1 2023-09-29 00:42:15,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:17,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:42:18,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:42:18,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:26,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:42:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:30,617 INFO [train.py:1039] (1/4) Epoch 6, batch 2600, loss[loss=0.3169, simple_loss=0.342, pruned_loss=0.1459, over 19394.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.2993, pruned_loss=0.08922, over 4687395.48 frames. ], batch size: 388, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:42:32,247 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 00:42:35,210 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 00:42:35,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:42:35,304 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 00:42:37,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 00:42:37,425 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 00:42:38,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.84 vs. limit=15.0 2023-09-29 00:42:39,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:39,282 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 00:42:41,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 00:42:42,786 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 00:42:45,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:42:47,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 00:42:47,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 00:42:48,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:42:48,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 00:42:52,054 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 00:42:53,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 00:43:00,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.76 vs. limit=22.5 2023-09-29 00:43:03,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:03,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:05,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:05,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 00:43:06,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:43:12,043 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 00:43:18,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:20,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:20,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 00:43:20,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:20,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:21,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 00:43:25,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:43:25,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:43:27,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:31,056 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 00:43:32,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:32,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:43:40,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:41,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:43:41,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 00:43:41,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:44,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:43:44,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:43:50,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=194666.66666666666, ans=0.1 2023-09-29 00:43:52,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 00:43:52,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:53,556 INFO [train.py:1039] (1/4) Epoch 6, batch 2650, loss[loss=0.2523, simple_loss=0.2986, pruned_loss=0.103, over 23782.00 frames. ], tot_loss[loss=0.2388, simple_loss=0.2999, pruned_loss=0.08879, over 4710324.86 frames. ], batch size: 164, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:43:53,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:43:58,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 00:43:58,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:59,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:44:01,190 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 00:44:01,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:04,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:08,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:44:10,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:44:10,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=194800.0, ans=0.0 2023-09-29 00:44:12,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:44:14,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 00:44:14,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:44:15,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:44:17,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 00:44:18,616 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 00:44:21,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:22,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 00:44:23,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:23,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 00:44:27,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:44:29,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:34,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 00:44:34,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 00:44:36,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:44:38,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=194866.66666666666, ans=0.125 2023-09-29 00:44:39,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 00:44:39,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:41,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:41,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:44:43,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:43,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:46,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:49,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:49,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:44:51,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:44:52,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:54,155 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.176e+02 2.610e+02 3.276e+02 6.463e+02, threshold=5.220e+02, percent-clipped=8.0 2023-09-29 00:44:54,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:44:54,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:55,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=194933.33333333334, ans=0.0 2023-09-29 00:44:56,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:56,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:45:03,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:03,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:45:03,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:03,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 00:45:06,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:08,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:09,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:11,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:12,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:45:12,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:14,211 INFO [train.py:1039] (1/4) Epoch 6, batch 2700, loss[loss=0.2346, simple_loss=0.3054, pruned_loss=0.08189, over 24685.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3009, pruned_loss=0.08961, over 4694421.21 frames. ], batch size: 73, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:45:15,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:15,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 00:45:17,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=195066.66666666666, ans=15.0 2023-09-29 00:45:18,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:45:21,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 00:45:22,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:45:22,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:22,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:24,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:45:24,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:24,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:45:24,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:45:24,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 00:45:24,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:45:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:45:28,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:45:29,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:32,156 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.31 vs. limit=15.0 2023-09-29 00:45:33,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:45:33,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=195133.33333333334, ans=0.1 2023-09-29 00:45:34,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 00:45:36,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:45:42,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:45:42,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:45:48,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:45:48,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:48,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:45:48,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:45:53,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:45:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:57,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:45:57,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:46:01,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:01,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:46:10,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:46:12,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:15,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:46:15,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:15,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.71 vs. limit=15.0 2023-09-29 00:46:18,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:19,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:19,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:46:21,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:22,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:22,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:26,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:46:26,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:26,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:29,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 00:46:29,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:31,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=195333.33333333334, ans=0.0 2023-09-29 00:46:33,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:46:33,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 00:46:36,437 INFO [train.py:1039] (1/4) Epoch 6, batch 2750, loss[loss=0.207, simple_loss=0.2511, pruned_loss=0.08139, over 23431.00 frames. ], tot_loss[loss=0.2403, simple_loss=0.3009, pruned_loss=0.08991, over 4688564.73 frames. ], batch size: 285, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:46:36,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 00:46:36,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:40,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:46:40,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:41,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:41,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:46:42,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:45,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:46:45,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:46:46,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:46:46,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:46,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 00:46:46,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:46:46,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:48,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=195400.0, ans=0.1 2023-09-29 00:46:54,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 00:46:55,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:55,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:55,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:57,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:46:59,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:59,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:47:01,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:01,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:05,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:47:05,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:47:07,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:47:07,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:10,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:47:12,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=195533.33333333334, ans=0.2 2023-09-29 00:47:18,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:20,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:47:20,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:26,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:26,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:47:26,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:47:33,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:47:33,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:47:33,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 00:47:38,282 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 2.212e+02 2.511e+02 3.083e+02 4.520e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-29 00:47:39,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:41,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 00:47:45,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=195666.66666666666, ans=0.125 2023-09-29 00:47:48,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:47:50,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:47:50,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 00:47:52,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:47:53,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:47:53,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 00:47:53,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:47:56,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:47:58,241 INFO [train.py:1039] (1/4) Epoch 6, batch 2800, loss[loss=0.2568, simple_loss=0.3058, pruned_loss=0.1039, over 23854.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.2992, pruned_loss=0.08926, over 4693634.04 frames. ], batch size: 212, lr: 1.70e-02, grad_scale: 32.0 2023-09-29 00:47:58,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:47:58,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:47:59,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 00:47:59,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:00,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:03,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:03,157 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 00:48:03,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 00:48:07,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:09,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:48:09,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:48:14,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:48:15,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 00:48:19,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:48:20,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 00:48:21,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:48:22,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:24,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:26,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:26,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:48:26,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=195800.0, ans=0.125 2023-09-29 00:48:27,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:48:35,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:48:37,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:38,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:40,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:48:41,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=195866.66666666666, ans=0.1 2023-09-29 00:48:42,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:46,561 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:48:47,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:48:47,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 00:48:49,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:49,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:48:56,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:57,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:59,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:49:01,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:49:02,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:02,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:49:02,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:49:04,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:49:05,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=12.45 vs. limit=15.0 2023-09-29 00:49:06,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:49:06,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 00:49:06,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:07,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:49:07,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:09,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 00:49:10,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:10,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:49:12,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:49:13,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 00:49:15,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=196000.0, ans=0.125 2023-09-29 00:49:19,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:49:19,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:49:19,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:49:21,351 INFO [train.py:1039] (1/4) Epoch 6, batch 2850, loss[loss=0.2417, simple_loss=0.2956, pruned_loss=0.09387, over 23640.00 frames. ], tot_loss[loss=0.2367, simple_loss=0.2972, pruned_loss=0.08816, over 4697029.79 frames. ], batch size: 232, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:49:23,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:26,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:49:26,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:49:26,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:49:29,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:29,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:32,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:49:33,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 00:49:39,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 00:49:39,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:41,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 00:49:41,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:43,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.42 vs. limit=6.0 2023-09-29 00:49:44,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 00:49:44,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 00:49:47,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:59,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:59,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:49:59,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=196200.0, ans=0.125 2023-09-29 00:50:00,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:50:01,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:50:01,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:50:01,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:50:02,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:50:02,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 00:50:07,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:50:07,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:07,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:09,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:12,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:14,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=196266.66666666666, ans=0.125 2023-09-29 00:50:15,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:15,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:50:17,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:17,837 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.34 vs. limit=15.0 2023-09-29 00:50:18,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:21,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:50:21,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=196266.66666666666, ans=0.125 2023-09-29 00:50:21,727 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.24 vs. limit=15.0 2023-09-29 00:50:24,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.329e+02 2.690e+02 4.548e+02, threshold=4.658e+02, percent-clipped=0.0 2023-09-29 00:50:27,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:50:31,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 00:50:31,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 00:50:32,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:50:32,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:32,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 00:50:34,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:50:34,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:34,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:34,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:50:34,545 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 00:50:34,627 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 00:50:34,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:36,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:42,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:50:42,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:42,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:44,192 INFO [train.py:1039] (1/4) Epoch 6, batch 2900, loss[loss=0.2538, simple_loss=0.3303, pruned_loss=0.08861, over 24044.00 frames. ], tot_loss[loss=0.2377, simple_loss=0.2981, pruned_loss=0.08858, over 4685450.19 frames. ], batch size: 80, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:50:44,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 00:50:47,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:47,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 00:50:47,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 00:50:50,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:50:50,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:50:52,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:55,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:56,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=196400.0, ans=0.125 2023-09-29 00:50:58,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.77 vs. limit=15.0 2023-09-29 00:50:59,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:59,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:51:02,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:51:02,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 00:51:04,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:51:06,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:09,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 00:51:10,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 00:51:14,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:51:14,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 00:51:14,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:51:16,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=196533.33333333334, ans=0.0 2023-09-29 00:51:17,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:51:17,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:51:20,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:51:20,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:23,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:51:25,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:27,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 00:51:27,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 00:51:27,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:51:32,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:51:34,675 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:51:35,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 00:51:36,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:51:43,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:43,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=196600.0, ans=0.1 2023-09-29 00:51:52,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:51:52,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:51:54,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 00:51:57,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:57,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 00:51:58,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:00,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:52:02,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=196666.66666666666, ans=0.125 2023-09-29 00:52:05,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:06,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=196733.33333333334, ans=0.0 2023-09-29 00:52:07,278 INFO [train.py:1039] (1/4) Epoch 6, batch 2950, loss[loss=0.2216, simple_loss=0.2923, pruned_loss=0.07546, over 24484.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2985, pruned_loss=0.08715, over 4707571.70 frames. ], batch size: 63, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:52:07,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 00:52:07,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:07,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:11,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:12,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:52:14,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 00:52:14,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 00:52:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:52:14,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:21,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:23,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:24,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:52:24,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:28,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:52:29,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:52:30,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:52:34,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 00:52:39,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=196866.66666666666, ans=0.1 2023-09-29 00:52:40,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 00:52:42,922 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 00:52:43,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:52:45,985 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 00:52:46,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 00:52:46,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:47,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:47,550 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 00:52:47,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:52:49,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 00:52:51,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:51,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:52:54,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:56,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:52:56,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:52:58,040 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 00:52:58,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:59,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 00:53:05,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:07,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 00:53:07,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:53:10,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.213e+02 2.464e+02 2.740e+02 4.622e+02, threshold=4.928e+02, percent-clipped=0.0 2023-09-29 00:53:10,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 00:53:11,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:15,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:53:15,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:53:15,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:15,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:53:17,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:53:19,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:19,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:53:19,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:53:20,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:20,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:53:22,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:22,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 00:53:22,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=197000.0, ans=0.2 2023-09-29 00:53:24,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:26,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:53:27,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:53:30,693 INFO [train.py:1039] (1/4) Epoch 6, batch 3000, loss[loss=0.2695, simple_loss=0.3155, pruned_loss=0.1118, over 23529.00 frames. ], tot_loss[loss=0.238, simple_loss=0.2996, pruned_loss=0.08823, over 4705110.31 frames. ], batch size: 256, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:53:30,693 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 00:53:44,105 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.8422, 2.4853, 1.7824, 2.4949, 2.1734, 2.4209, 2.4518, 2.5178], device='cuda:1') 2023-09-29 00:53:45,525 INFO [train.py:1071] (1/4) Epoch 6, validation: loss=0.3825, simple_loss=0.3275, pruned_loss=0.2187, over 1125622.00 frames. 2023-09-29 00:53:45,526 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 00:53:47,221 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 00:53:47,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 00:53:49,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=197066.66666666666, ans=0.1 2023-09-29 00:53:50,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:50,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:53:51,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 00:53:51,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:54:00,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:54:08,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:54:14,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 00:54:17,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:54:19,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:54:19,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:54:21,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:54:21,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=197200.0, ans=15.0 2023-09-29 00:54:23,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:23,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 00:54:26,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 00:54:28,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:54:28,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:54:30,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:54:30,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:32,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:32,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:54:36,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=197266.66666666666, ans=0.1 2023-09-29 00:54:36,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=197266.66666666666, ans=0.0 2023-09-29 00:54:37,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:54:37,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:37,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:54:39,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:42,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 00:54:42,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:54:42,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:54:44,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:54:48,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:48,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:50,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:54:50,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 00:54:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:54:50,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 00:54:50,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:54:53,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 00:54:57,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:54:57,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 00:54:57,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 00:54:59,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 00:54:59,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:55:00,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:55:02,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:55:02,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:55:02,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:03,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:55:04,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=197333.33333333334, ans=0.125 2023-09-29 00:55:07,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 00:55:08,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.65 vs. limit=22.5 2023-09-29 00:55:09,409 INFO [train.py:1039] (1/4) Epoch 6, batch 3050, loss[loss=0.254, simple_loss=0.3092, pruned_loss=0.09941, over 23395.00 frames. ], tot_loss[loss=0.2384, simple_loss=0.3002, pruned_loss=0.08833, over 4716733.85 frames. ], batch size: 119, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:55:09,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:12,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:12,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:55:17,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:20,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 00:55:26,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 00:55:28,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 00:55:28,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:31,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:55:33,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=197466.66666666666, ans=0.0 2023-09-29 00:55:34,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:34,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:36,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:40,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:55:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:55:41,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:41,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:41,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:43,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:47,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:48,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:49,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 00:55:50,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:50,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:55:53,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:54,821 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.66 vs. limit=5.0 2023-09-29 00:55:55,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:55:56,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:55:56,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:02,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:56:04,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:10,217 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.122e+02 2.325e+02 2.738e+02 3.532e+02, threshold=4.649e+02, percent-clipped=0.0 2023-09-29 00:56:10,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:10,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:56:10,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:56:12,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:12,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:56:14,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:56:14,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 00:56:17,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:17,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:20,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 00:56:23,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:29,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:31,034 INFO [train.py:1039] (1/4) Epoch 6, batch 3100, loss[loss=0.2401, simple_loss=0.3148, pruned_loss=0.08273, over 24296.00 frames. ], tot_loss[loss=0.2378, simple_loss=0.2997, pruned_loss=0.08798, over 4724116.25 frames. ], batch size: 74, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:56:31,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:56:34,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:56:35,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 00:56:36,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.79 vs. limit=15.0 2023-09-29 00:56:37,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 00:56:40,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 00:56:43,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:56:44,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:56:44,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:48,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:56:52,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:00,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 00:57:04,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:57:04,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:05,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:07,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:07,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 00:57:10,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:57:10,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 00:57:10,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:57:11,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:13,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 00:57:13,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:57:17,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:57:17,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 00:57:20,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 00:57:20,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:20,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:25,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:25,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:25,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:57:25,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=197933.33333333334, ans=0.125 2023-09-29 00:57:26,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:57:26,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:57:27,693 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.49 vs. limit=22.5 2023-09-29 00:57:29,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:57:29,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:57:29,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:29,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 00:57:30,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=197933.33333333334, ans=0.0 2023-09-29 00:57:34,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:36,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 00:57:39,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:57:39,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 00:57:39,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:41,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:41,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 00:57:49,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 00:57:51,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff3.min_abs, batch_count=198066.66666666666, ans=0.2 2023-09-29 00:57:51,992 INFO [train.py:1039] (1/4) Epoch 6, batch 3150, loss[loss=0.2538, simple_loss=0.2993, pruned_loss=0.1042, over 23899.00 frames. ], tot_loss[loss=0.236, simple_loss=0.2982, pruned_loss=0.08687, over 4727318.40 frames. ], batch size: 195, lr: 1.69e-02, grad_scale: 16.0 2023-09-29 00:57:52,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:57:54,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:55,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:55,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:57:55,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 00:57:59,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:00,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:58:02,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 00:58:02,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:05,784 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 00:58:09,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 00:58:09,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:58:10,980 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 00:58:11,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:58:12,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 00:58:14,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 00:58:14,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 00:58:14,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:14,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:14,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=198133.33333333334, ans=0.1 2023-09-29 00:58:15,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:17,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 00:58:20,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:23,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:58:26,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 00:58:28,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:58:31,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:58:31,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:33,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 00:58:35,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 00:58:36,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:58:36,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:58:36,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:58:36,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:58:36,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:58:39,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:58:39,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:58:40,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 00:58:42,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:58:42,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:43,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:58:43,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:45,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 00:58:45,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=198266.66666666666, ans=0.2 2023-09-29 00:58:46,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:48,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 00:58:48,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:48,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 00:58:49,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 00:58:52,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:58:52,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:54,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 00:58:55,615 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.072e+02 2.389e+02 2.889e+02 3.902e+02, threshold=4.779e+02, percent-clipped=0.0 2023-09-29 00:58:55,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:58:55,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:59:00,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:59:01,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:02,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:59:05,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=198333.33333333334, ans=0.1 2023-09-29 00:59:07,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:59:07,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:12,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 00:59:14,244 INFO [train.py:1039] (1/4) Epoch 6, batch 3200, loss[loss=0.2349, simple_loss=0.2933, pruned_loss=0.08832, over 23814.00 frames. ], tot_loss[loss=0.2343, simple_loss=0.297, pruned_loss=0.08583, over 4733920.45 frames. ], batch size: 212, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 00:59:17,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=198400.0, ans=0.125 2023-09-29 00:59:18,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:59:18,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:59:21,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:23,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:59:23,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 00:59:24,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:59:29,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:59:34,136 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:59:35,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:39,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=198466.66666666666, ans=0.0 2023-09-29 00:59:44,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:59:55,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 00:59:56,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:00:00,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 01:00:00,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:00:03,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=198600.0, ans=0.09899494936611666 2023-09-29 01:00:03,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=198600.0, ans=0.2 2023-09-29 01:00:04,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:00:04,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:00:05,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:00:09,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 01:00:10,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 01:00:12,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 01:00:17,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 01:00:19,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:00:26,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:00:26,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,799 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 01:00:26,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:00:31,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:00:33,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 01:00:34,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 01:00:36,063 INFO [train.py:1039] (1/4) Epoch 6, batch 3250, loss[loss=0.253, simple_loss=0.3075, pruned_loss=0.09926, over 18076.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.2974, pruned_loss=0.08618, over 4728505.61 frames. ], batch size: 39, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:00:36,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 01:00:37,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 01:00:39,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:00:42,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:00:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 01:00:42,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:00:42,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:00:42,703 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 01:00:48,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:00:49,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=198733.33333333334, ans=0.125 2023-09-29 01:00:51,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:00,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:00,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 01:01:02,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:02,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:02,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:05,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:05,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:01:07,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=198800.0, ans=0.0 2023-09-29 01:01:08,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:08,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:01:08,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:09,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.71 vs. limit=6.0 2023-09-29 01:01:09,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:13,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:16,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:17,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:17,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:18,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=198866.66666666666, ans=0.1 2023-09-29 01:01:19,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:19,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:19,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:21,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=198866.66666666666, ans=0.0 2023-09-29 01:01:23,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 01:01:24,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:24,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:01:26,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:27,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:01:33,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:01:39,287 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.10 vs. limit=22.5 2023-09-29 01:01:41,414 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.144e+02 2.444e+02 2.943e+02 3.918e+02, threshold=4.889e+02, percent-clipped=0.0 2023-09-29 01:01:41,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:01:43,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:43,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 01:01:43,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:01:43,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:01:44,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:47,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 01:01:47,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 01:01:47,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:49,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:50,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:50,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=199000.0, ans=0.1 2023-09-29 01:01:52,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:01:52,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:55,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:56,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:57,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 01:01:57,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:01:59,082 INFO [train.py:1039] (1/4) Epoch 6, batch 3300, loss[loss=0.2324, simple_loss=0.3062, pruned_loss=0.07932, over 24437.00 frames. ], tot_loss[loss=0.2355, simple_loss=0.298, pruned_loss=0.08652, over 4731444.14 frames. ], batch size: 77, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:02:00,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:02:00,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 01:02:02,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:02:04,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 01:02:07,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 01:02:07,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 01:02:07,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:12,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:02:14,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:02:14,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:17,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:02:17,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:02:19,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:20,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:02:25,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 01:02:25,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:25,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:27,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:28,736 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 01:02:30,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:02:31,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:02:31,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:02:31,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:02:31,967 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 01:02:36,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:36,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:02:38,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:40,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 01:02:41,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:02:41,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:41,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:02:43,774 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 01:02:47,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 01:02:47,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:02:50,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 01:02:51,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:02:55,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:02:56,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:02:56,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=199266.66666666666, ans=0.125 2023-09-29 01:02:58,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:58,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:58,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:58,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:03:02,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:03:02,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:03,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:03:05,054 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 01:03:06,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 01:03:09,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:03:10,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:10,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:13,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:03:13,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:14,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:03:14,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:14,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:03:16,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:17,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:03:21,879 INFO [train.py:1039] (1/4) Epoch 6, batch 3350, loss[loss=0.245, simple_loss=0.3004, pruned_loss=0.09473, over 23555.00 frames. ], tot_loss[loss=0.2377, simple_loss=0.2994, pruned_loss=0.08801, over 4726466.31 frames. ], batch size: 120, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:03:21,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 01:03:22,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:23,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:25,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:03:25,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:03:28,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:29,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:29,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:32,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:03:34,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:34,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:03:36,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=199466.66666666666, ans=0.0 2023-09-29 01:03:37,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:40,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:03:40,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:42,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:03:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 01:03:46,040 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 01:03:46,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:49,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 01:03:49,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 01:03:50,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:03:50,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:03:52,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:52,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 01:03:52,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:52,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:03:52,799 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.05 vs. limit=15.0 2023-09-29 01:03:55,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:57,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:57,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:00,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:04:03,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:05,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:06,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:08,602 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:04:09,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:04:10,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=199600.0, ans=10.0 2023-09-29 01:04:11,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:13,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:13,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:16,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:18,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 01:04:18,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:04:18,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 01:04:19,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:04:19,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 01:04:21,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:23,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:25,671 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.082e+02 2.289e+02 2.624e+02 4.671e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 01:04:31,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:31,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=199666.66666666666, ans=0.2 2023-09-29 01:04:32,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 01:04:33,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:04:34,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:04:34,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:04:39,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:04:42,788 INFO [train.py:1039] (1/4) Epoch 6, batch 3400, loss[loss=0.2436, simple_loss=0.2904, pruned_loss=0.09841, over 23817.00 frames. ], tot_loss[loss=0.238, simple_loss=0.2998, pruned_loss=0.08808, over 4713330.11 frames. ], batch size: 179, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:04:42,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 01:04:42,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:04:43,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:04:45,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:45,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 01:04:47,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:47,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 01:04:49,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:49,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:49,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:04:50,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:04:51,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 01:04:55,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 01:04:55,776 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 01:04:55,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:00,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:05:00,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:05:00,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:02,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:05:07,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:09,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 01:05:14,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:05:16,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:17,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:18,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:05:24,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:05:29,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 01:05:35,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 01:05:37,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:05:39,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:39,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:41,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:05:43,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=199933.33333333334, ans=0.0 2023-09-29 01:05:44,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:48,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:05:48,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:05:57,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:05:58,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 01:06:02,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:06:06,477 INFO [train.py:1039] (1/4) Epoch 6, batch 3450, loss[loss=0.2478, simple_loss=0.3071, pruned_loss=0.09429, over 23519.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3009, pruned_loss=0.08931, over 4697879.10 frames. ], batch size: 93, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:06:06,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 01:06:09,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 01:06:11,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:13,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:06:13,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 01:06:13,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=200066.66666666666, ans=0.125 2023-09-29 01:06:14,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:06:18,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:06:23,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:06:24,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:26,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:06:26,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:28,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:34,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 01:06:37,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=200200.0, ans=0.125 2023-09-29 01:06:40,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 01:06:40,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:06:40,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:06:42,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:48,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 01:06:50,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:06:54,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:06:54,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:56,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:06:58,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:06:59,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 01:06:59,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:06:59,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=200266.66666666666, ans=0.125 2023-09-29 01:07:00,119 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.25 vs. limit=15.0 2023-09-29 01:07:03,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:07:04,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:06,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 01:07:11,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:07:12,883 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.101e+02 2.630e+02 3.255e+02 5.395e+02, threshold=5.260e+02, percent-clipped=4.0 2023-09-29 01:07:14,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:07:14,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=200333.33333333334, ans=0.125 2023-09-29 01:07:16,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:19,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=200333.33333333334, ans=10.0 2023-09-29 01:07:19,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:20,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=200333.33333333334, ans=0.125 2023-09-29 01:07:23,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:23,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:07:25,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:07:25,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:30,104 INFO [train.py:1039] (1/4) Epoch 6, batch 3500, loss[loss=0.2425, simple_loss=0.3, pruned_loss=0.09245, over 23492.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.2984, pruned_loss=0.08836, over 4693439.03 frames. ], batch size: 134, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:07:30,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:32,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=200400.0, ans=0.1 2023-09-29 01:07:35,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:07:35,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 01:07:36,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:07:39,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:07:41,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:41,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 01:07:44,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:07:45,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=200466.66666666666, ans=0.125 2023-09-29 01:07:47,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:49,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:07:49,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:07:49,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:07:50,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:50,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:07:50,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 01:07:55,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:55,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:07:55,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=200466.66666666666, ans=10.0 2023-09-29 01:07:58,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:01,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:03,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 01:08:03,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:08:07,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:07,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:08:08,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:10,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:08:10,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:12,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 01:08:12,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 01:08:13,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 01:08:13,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:15,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:15,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=200533.33333333334, ans=0.125 2023-09-29 01:08:16,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:16,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:08:19,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:08:21,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:08:27,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:08:28,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 01:08:28,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 01:08:28,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:08:31,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:33,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:35,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:39,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 01:08:40,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:42,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:43,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 01:08:45,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 01:08:47,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:48,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:48,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:08:48,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:08:53,068 INFO [train.py:1039] (1/4) Epoch 6, batch 3550, loss[loss=0.2506, simple_loss=0.3048, pruned_loss=0.09818, over 23791.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.2975, pruned_loss=0.08803, over 4700865.94 frames. ], batch size: 164, lr: 1.68e-02, grad_scale: 8.0 2023-09-29 01:08:53,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:08:58,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=200733.33333333334, ans=0.0 2023-09-29 01:09:02,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:05,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 01:09:07,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:09,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:09:13,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:13,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=200800.0, ans=0.1 2023-09-29 01:09:14,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:09:14,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:09:16,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:16,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:09:17,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:17,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:09:19,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:09:24,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:09:24,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:27,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:27,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:28,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:09:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 01:09:28,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:30,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:31,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:09:35,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.39 vs. limit=10.0 2023-09-29 01:09:36,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:37,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:39,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:40,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 01:09:42,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:09:42,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 01:09:44,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:46,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:09:46,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:09:50,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 01:09:52,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:09:59,416 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.123e+02 2.427e+02 3.037e+02 5.186e+02, threshold=4.854e+02, percent-clipped=0.0 2023-09-29 01:09:59,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:01,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 01:10:01,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:04,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:10:04,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 01:10:04,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=201000.0, ans=0.125 2023-09-29 01:10:11,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 01:10:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:10:12,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:10:14,608 INFO [train.py:1039] (1/4) Epoch 6, batch 3600, loss[loss=0.239, simple_loss=0.3122, pruned_loss=0.08292, over 24289.00 frames. ], tot_loss[loss=0.235, simple_loss=0.2961, pruned_loss=0.08693, over 4706287.89 frames. ], batch size: 74, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:10:16,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:16,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:18,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:10:23,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:25,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:25,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:10:25,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:10:26,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:26,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 01:10:31,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:10:32,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:33,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=201133.33333333334, ans=0.1 2023-09-29 01:10:37,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:37,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:39,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:10:40,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:40,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 01:10:42,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:45,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:47,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:10:48,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:51,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:51,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:10:54,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 01:10:54,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=201200.0, ans=0.125 2023-09-29 01:11:02,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:03,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:11:03,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 01:11:08,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:11:10,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=201266.66666666666, ans=0.125 2023-09-29 01:11:14,235 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.44 vs. limit=15.0 2023-09-29 01:11:14,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:16,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:24,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:11:25,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:11:25,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 01:11:25,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=201333.33333333334, ans=0.07 2023-09-29 01:11:27,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 01:11:27,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 01:11:28,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=201333.33333333334, ans=0.1 2023-09-29 01:11:30,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:11:32,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:11:34,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 01:11:34,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:11:35,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:11:35,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:35,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 01:11:35,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 01:11:38,727 INFO [train.py:1039] (1/4) Epoch 6, batch 3650, loss[loss=0.1875, simple_loss=0.2592, pruned_loss=0.05792, over 24586.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.2963, pruned_loss=0.08654, over 4722356.79 frames. ], batch size: 60, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:11:39,208 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:11:40,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:41,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 01:11:42,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=201400.0, ans=0.125 2023-09-29 01:11:46,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 01:11:46,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:11:49,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=201400.0, ans=0.125 2023-09-29 01:11:51,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 01:11:51,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 01:11:55,870 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-09-29 01:11:56,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:11:56,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:11:58,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:12:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:12:01,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:12:01,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=201466.66666666666, ans=0.1 2023-09-29 01:12:03,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 01:12:03,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:12:04,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:05,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 01:12:05,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:12:07,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:07,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:09,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:12:12,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 01:12:12,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 01:12:14,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:12:15,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 01:12:17,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:17,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:12:21,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:12:25,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:25,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:12:26,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:12:27,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:12:30,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:12:30,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=201600.0, ans=0.125 2023-09-29 01:12:32,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:33,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:33,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:35,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:12:37,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:37,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:44,483 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 01:12:47,280 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.133e+02 2.417e+02 2.802e+02 4.868e+02, threshold=4.835e+02, percent-clipped=1.0 2023-09-29 01:12:49,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=201666.66666666666, ans=0.125 2023-09-29 01:12:50,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:50,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:51,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:12:51,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:52,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:12:53,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:55,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 01:12:55,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:58,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:12:59,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:13:00,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=201733.33333333334, ans=0.125 2023-09-29 01:13:01,771 INFO [train.py:1039] (1/4) Epoch 6, batch 3700, loss[loss=0.2414, simple_loss=0.2998, pruned_loss=0.09154, over 23651.00 frames. ], tot_loss[loss=0.2357, simple_loss=0.2975, pruned_loss=0.08695, over 4728632.61 frames. ], batch size: 149, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:13:01,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:13:03,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:03,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 01:13:03,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:13:05,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:13:07,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:13:10,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:13:13,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:13,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:15,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:13:17,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:17,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:13:17,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:19,281 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 01:13:28,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:13:28,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:13:29,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:13:29,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 01:13:29,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:35,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:36,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 01:13:38,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:39,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:13:42,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:42,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:13:44,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:13:46,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.64 vs. limit=22.5 2023-09-29 01:13:48,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:48,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 01:13:48,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:48,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 01:13:53,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:13:54,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:13:59,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:59,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 01:14:02,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:14:02,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:14:02,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:07,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:07,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 01:14:09,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 01:14:09,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:14:10,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:12,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:14:14,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:14:15,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:14:17,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:14:18,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:14:19,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.76 vs. limit=22.5 2023-09-29 01:14:21,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 01:14:23,855 INFO [train.py:1039] (1/4) Epoch 6, batch 3750, loss[loss=0.2457, simple_loss=0.3225, pruned_loss=0.08445, over 24446.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.2994, pruned_loss=0.08748, over 4723568.20 frames. ], batch size: 69, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:14:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:14:25,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:14:27,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 01:14:27,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:14:29,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:14:35,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:35,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=202066.66666666666, ans=0.125 2023-09-29 01:14:38,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:14:40,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:14:44,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:48,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:14:50,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 01:14:50,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:14:51,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:14:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:53,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=202133.33333333334, ans=0.1 2023-09-29 01:14:55,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 01:14:57,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=202200.0, ans=0.1 2023-09-29 01:15:00,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 01:15:01,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:15:01,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:15:03,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:08,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:10,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:15:16,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 01:15:18,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:22,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:15:22,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:15:26,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:15:30,043 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.98 vs. limit=10.0 2023-09-29 01:15:30,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:15:31,935 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.301e+02 2.601e+02 3.130e+02 4.781e+02, threshold=5.202e+02, percent-clipped=0.0 2023-09-29 01:15:32,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:15:32,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=202333.33333333334, ans=0.1 2023-09-29 01:15:35,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:15:36,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:15:39,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:15:46,256 INFO [train.py:1039] (1/4) Epoch 6, batch 3800, loss[loss=0.2473, simple_loss=0.306, pruned_loss=0.09428, over 23502.00 frames. ], tot_loss[loss=0.237, simple_loss=0.2991, pruned_loss=0.08747, over 4718004.88 frames. ], batch size: 105, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:15:48,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:15:53,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:54,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=202400.0, ans=0.125 2023-09-29 01:15:55,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:15:55,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 01:15:58,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:58,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:00,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:16:01,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:16:01,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:03,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:16:05,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:16:05,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:16:05,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:06,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=202466.66666666666, ans=0.05 2023-09-29 01:16:07,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 01:16:10,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 01:16:10,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:16:13,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:15,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:16:17,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:16:19,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:16:19,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:20,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:22,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:16:28,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 01:16:30,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:30,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=202533.33333333334, ans=0.0 2023-09-29 01:16:38,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:16:43,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:16:46,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 01:16:48,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 01:16:48,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:51,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:51,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:53,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.44 vs. limit=10.0 2023-09-29 01:16:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 01:16:58,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 01:17:00,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 01:17:00,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:02,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:17:07,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:17:08,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:17:10,330 INFO [train.py:1039] (1/4) Epoch 6, batch 3850, loss[loss=0.2462, simple_loss=0.318, pruned_loss=0.08721, over 24335.00 frames. ], tot_loss[loss=0.2354, simple_loss=0.2969, pruned_loss=0.087, over 4719738.71 frames. ], batch size: 74, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:17:14,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:17:15,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 01:17:18,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:17:18,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:23,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:17:24,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:27,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:17:27,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 01:17:35,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:37,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:38,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=202800.0, ans=0.125 2023-09-29 01:17:40,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:40,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:17:41,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=202866.66666666666, ans=0.125 2023-09-29 01:17:43,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:43,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:17:44,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:44,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:17:46,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:49,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:51,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:51,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:17:51,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 01:17:51,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 01:17:51,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:51,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:54,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=202866.66666666666, ans=0.2 2023-09-29 01:17:55,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:17:55,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:57,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 01:17:59,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 01:17:59,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=202933.33333333334, ans=0.0 2023-09-29 01:18:00,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:02,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 01:18:05,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:18:08,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=202933.33333333334, ans=0.125 2023-09-29 01:18:09,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:11,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:18:16,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=203000.0, ans=0.0 2023-09-29 01:18:17,504 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.249e+02 2.602e+02 3.151e+02 5.214e+02, threshold=5.203e+02, percent-clipped=1.0 2023-09-29 01:18:17,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:17,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 01:18:20,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 01:18:21,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=203000.0, ans=0.1 2023-09-29 01:18:23,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:23,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:26,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:18:26,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:18:26,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.19 vs. limit=15.0 2023-09-29 01:18:27,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:18:29,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 01:18:29,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:18:30,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 01:18:30,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:30,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:32,219 INFO [train.py:1039] (1/4) Epoch 6, batch 3900, loss[loss=0.2501, simple_loss=0.3171, pruned_loss=0.09153, over 23626.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.2959, pruned_loss=0.08681, over 4710748.02 frames. ], batch size: 85, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:18:33,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:18:33,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:36,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:18:36,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:36,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:38,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:18:38,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 01:18:38,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:44,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:44,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:44,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:18:46,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:49,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:49,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:49,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=203133.33333333334, ans=0.125 2023-09-29 01:18:51,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:18:52,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 01:18:52,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:18:54,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 01:18:54,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:56,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 01:18:58,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 01:19:02,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:02,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:19:04,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:19:04,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:08,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:12,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:19:13,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:19:13,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:13,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:19:19,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=203200.0, ans=0.04949747468305833 2023-09-29 01:19:21,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:19:21,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:19:30,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:19:33,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:19:34,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=203266.66666666666, ans=0.125 2023-09-29 01:19:38,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=203333.33333333334, ans=0.0 2023-09-29 01:19:41,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:19:44,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:44,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 01:19:46,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 01:19:46,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:46,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 01:19:46,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=203333.33333333334, ans=0.125 2023-09-29 01:19:50,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:50,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 01:19:55,239 INFO [train.py:1039] (1/4) Epoch 6, batch 3950, loss[loss=0.2477, simple_loss=0.3033, pruned_loss=0.09611, over 23386.00 frames. ], tot_loss[loss=0.2337, simple_loss=0.2955, pruned_loss=0.08597, over 4718650.86 frames. ], batch size: 285, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:19:57,211 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:19:58,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:19:58,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=203400.0, ans=0.125 2023-09-29 01:20:01,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 01:20:01,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:20:04,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=203400.0, ans=10.0 2023-09-29 01:20:05,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:20:07,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:20:13,298 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 01:20:13,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:14,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 01:20:14,862 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 01:20:14,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:20:17,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:17,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:20:17,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:21,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 01:20:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:20:24,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:24,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:20:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:20:26,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:20:36,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:20:38,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:20:42,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 01:20:47,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 01:20:47,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 01:20:47,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:20:49,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:20:59,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:20:59,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:21:00,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:00,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:21:00,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 01:21:02,045 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.135e+02 2.350e+02 2.654e+02 4.554e+02, threshold=4.701e+02, percent-clipped=0.0 2023-09-29 01:21:06,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:21:08,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:21:12,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 01:21:17,428 INFO [train.py:1039] (1/4) Epoch 6, batch 4000, loss[loss=0.2433, simple_loss=0.3056, pruned_loss=0.09049, over 24034.00 frames. ], tot_loss[loss=0.2342, simple_loss=0.296, pruned_loss=0.08622, over 4725453.12 frames. ], batch size: 86, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:21:22,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:30,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=203733.33333333334, ans=0.125 2023-09-29 01:21:30,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.38 vs. limit=15.0 2023-09-29 01:21:32,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:32,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=203800.0, ans=0.1 2023-09-29 01:21:34,617 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:21:37,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:37,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:21:37,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:37,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 01:21:38,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:21:40,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 01:21:40,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:21:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 01:21:41,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:44,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=203800.0, ans=0.025 2023-09-29 01:21:47,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:21:47,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:21:47,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:21:47,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:47,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:21:48,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:21:52,330 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 01:21:53,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:21:53,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:21:56,961 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 01:21:57,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:21:57,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:04,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 01:22:06,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:22:07,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:22:09,307 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 01:22:10,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:22:12,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 01:22:12,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:22:13,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:14,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:22:15,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:22:15,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:22:15,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=203933.33333333334, ans=0.1 2023-09-29 01:22:16,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:18,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=203933.33333333334, ans=0.1 2023-09-29 01:22:19,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 01:22:19,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:22,203 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 01:22:27,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:22:30,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:22:35,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:22:35,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:35,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:22:36,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:22:39,595 INFO [train.py:1039] (1/4) Epoch 6, batch 4050, loss[loss=0.206, simple_loss=0.2775, pruned_loss=0.0672, over 24349.00 frames. ], tot_loss[loss=0.2343, simple_loss=0.2965, pruned_loss=0.08605, over 4727571.86 frames. ], batch size: 56, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:22:45,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:46,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:22:47,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 01:22:49,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:22:49,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:22:51,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:22:52,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:22:54,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:58,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:59,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:00,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:23:03,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:23:03,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:23:03,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=204133.33333333334, ans=0.2 2023-09-29 01:23:08,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:09,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:23:12,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 01:23:13,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=204200.0, ans=0.05 2023-09-29 01:23:14,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 01:23:14,535 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 01:23:16,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:23:23,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 01:23:24,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:29,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:31,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=204266.66666666666, ans=0.2 2023-09-29 01:23:33,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:33,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:23:33,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:34,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:38,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 01:23:38,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:23:41,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:43,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 01:23:48,267 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.126e+02 2.466e+02 2.913e+02 5.658e+02, threshold=4.933e+02, percent-clipped=1.0 2023-09-29 01:23:48,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:48,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=204333.33333333334, ans=0.1 2023-09-29 01:23:50,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=204333.33333333334, ans=0.0 2023-09-29 01:23:56,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 01:23:56,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:56,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:23:59,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 01:23:59,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 01:23:59,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:23:59,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=204333.33333333334, ans=0.2 2023-09-29 01:23:59,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=204333.33333333334, ans=0.0 2023-09-29 01:24:02,555 INFO [train.py:1039] (1/4) Epoch 6, batch 4100, loss[loss=0.3663, simple_loss=0.3821, pruned_loss=0.1752, over 19187.00 frames. ], tot_loss[loss=0.2367, simple_loss=0.2991, pruned_loss=0.08716, over 4719275.40 frames. ], batch size: 389, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:24:02,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:02,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=204400.0, ans=0.1 2023-09-29 01:24:03,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=204400.0, ans=0.0 2023-09-29 01:24:04,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:04,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:24:12,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 01:24:14,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 01:24:14,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=204400.0, ans=0.125 2023-09-29 01:24:16,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 01:24:17,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 01:24:17,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:17,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:17,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:19,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:24:19,565 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 01:24:23,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:24,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:24:24,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:24,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:24:29,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:24:31,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:31,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:24:31,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 01:24:31,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:32,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:24:32,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:32,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:24:33,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 01:24:34,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:24:36,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 01:24:38,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:24:41,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:41,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 01:24:44,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:44,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:24:44,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:24:45,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.63 vs. limit=15.0 2023-09-29 01:24:46,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 01:24:47,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:24:48,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:24:50,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 01:24:51,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:51,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:24:53,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:00,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:02,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=204600.0, ans=0.1 2023-09-29 01:25:05,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:06,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:25:15,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:15,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:18,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=204666.66666666666, ans=0.1 2023-09-29 01:25:21,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:24,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:25:25,896 INFO [train.py:1039] (1/4) Epoch 6, batch 4150, loss[loss=0.2308, simple_loss=0.2732, pruned_loss=0.09423, over 23445.00 frames. ], tot_loss[loss=0.237, simple_loss=0.2991, pruned_loss=0.08748, over 4711121.13 frames. ], batch size: 285, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:25:27,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:25:29,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:25:30,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:25:30,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:34,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 01:25:34,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:36,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 01:25:37,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 01:25:37,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 01:25:39,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:44,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:25:44,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:48,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:25:49,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:25:50,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:25:52,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:25:52,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:54,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:25:58,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:02,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:02,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 01:26:07,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 01:26:07,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:26:07,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 01:26:07,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:26:07,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:09,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:11,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:17,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 01:26:21,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:23,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:26:24,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 01:26:24,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:26,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 01:26:29,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:26:29,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:31,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:33,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 01:26:33,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:26:33,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:26:33,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:26:34,529 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.193e+02 2.474e+02 2.867e+02 4.434e+02, threshold=4.949e+02, percent-clipped=0.0 2023-09-29 01:26:36,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 01:26:36,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:36,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:26:36,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:26:37,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 01:26:37,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:38,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:26:39,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:42,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:42,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 01:26:42,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:47,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=205066.66666666666, ans=0.0 2023-09-29 01:26:49,013 INFO [train.py:1039] (1/4) Epoch 6, batch 4200, loss[loss=0.212, simple_loss=0.283, pruned_loss=0.07052, over 24453.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.2979, pruned_loss=0.08668, over 4720704.68 frames. ], batch size: 63, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:26:49,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:26:49,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=205066.66666666666, ans=0.125 2023-09-29 01:26:50,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 01:26:54,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:26:56,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:26:57,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:26:57,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:57,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:59,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 01:26:59,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=205066.66666666666, ans=0.125 2023-09-29 01:27:03,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.06 vs. limit=15.0 2023-09-29 01:27:04,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 01:27:04,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:07,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:09,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:27:09,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=205133.33333333334, ans=0.0 2023-09-29 01:27:12,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:27:13,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:13,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:15,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 01:27:15,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:17,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:17,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:27:17,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:27:19,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:27:21,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 01:27:22,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:27:29,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:27:32,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:27:32,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=205200.0, ans=0.2 2023-09-29 01:27:33,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:27:36,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:27:36,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 01:27:36,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:27:36,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:27:39,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=205266.66666666666, ans=0.125 2023-09-29 01:27:42,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:27:45,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:52,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:27:55,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 01:27:57,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:27:57,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=205333.33333333334, ans=0.125 2023-09-29 01:28:02,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:28:04,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:05,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 01:28:06,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=205333.33333333334, ans=0.0 2023-09-29 01:28:11,984 INFO [train.py:1039] (1/4) Epoch 6, batch 4250, loss[loss=0.2314, simple_loss=0.3105, pruned_loss=0.07616, over 24337.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2965, pruned_loss=0.0856, over 4717782.54 frames. ], batch size: 74, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:28:12,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:28:16,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:28:16,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:28:18,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:23,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:28:23,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 01:28:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:28:27,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:30,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:36,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:37,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:38,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:28:38,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:28:40,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:42,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:42,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:43,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:28:45,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:47,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 01:28:49,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.83 vs. limit=22.5 2023-09-29 01:28:51,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 01:28:51,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:53,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:53,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:53,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:28:53,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:55,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:55,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=205533.33333333334, ans=0.125 2023-09-29 01:28:58,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:28:58,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:29:04,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:07,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 01:29:07,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:29:09,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 01:29:10,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:29:12,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:29:13,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:13,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:29:16,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 01:29:17,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:29:18,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:29:21,787 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.147e+02 2.416e+02 2.924e+02 5.280e+02, threshold=4.831e+02, percent-clipped=2.0 2023-09-29 01:29:22,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:22,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=205666.66666666666, ans=0.0 2023-09-29 01:29:25,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:26,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:29:26,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:29,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:30,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:29:33,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:29:33,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 01:29:35,423 INFO [train.py:1039] (1/4) Epoch 6, batch 4300, loss[loss=0.246, simple_loss=0.2972, pruned_loss=0.09742, over 23864.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.2967, pruned_loss=0.08572, over 4717584.50 frames. ], batch size: 195, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:29:35,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:41,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:29:45,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:56,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:56,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 01:29:56,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:29:56,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=205800.0, ans=0.0 2023-09-29 01:29:56,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=205800.0, ans=0.0 2023-09-29 01:29:59,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:29:59,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:29:59,446 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 01:30:02,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:30:06,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:09,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 01:30:09,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:30:09,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 01:30:12,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:30:14,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:30:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:30:17,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:30:17,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:30:19,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:19,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:30:21,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 01:30:21,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 01:30:24,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:30:28,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:30:28,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:28,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 01:30:28,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 01:30:28,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 01:30:30,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:30:31,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 01:30:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 01:30:35,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:36,009 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 01:30:37,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:30:39,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:39,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:41,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 01:30:43,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:43,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:44,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:30:44,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:44,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:30:46,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:30:49,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:50,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:52,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:57,153 INFO [train.py:1039] (1/4) Epoch 6, batch 4350, loss[loss=0.2457, simple_loss=0.3085, pruned_loss=0.09151, over 23462.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.2978, pruned_loss=0.08641, over 4714178.87 frames. ], batch size: 93, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:30:58,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 01:30:58,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:31:04,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-09-29 01:31:04,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:06,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:11,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:31:11,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:31:13,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=206133.33333333334, ans=0.0 2023-09-29 01:31:17,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:31:20,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:23,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:31:23,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:31:27,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:31:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:31:32,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:31:37,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 01:31:37,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:39,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:43,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:44,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 01:31:49,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:31:49,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:31:54,681 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 01:31:56,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:31:56,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:31:57,771 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 01:31:59,254 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 01:31:59,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:00,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:02,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:32:03,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:04,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:04,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:06,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.197e+02 2.443e+02 2.898e+02 4.711e+02, threshold=4.887e+02, percent-clipped=0.0 2023-09-29 01:32:07,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 01:32:07,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:07,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 01:32:09,440 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 01:32:09,450 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 01:32:09,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 01:32:12,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:32:12,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:32:13,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:14,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=206333.33333333334, ans=0.125 2023-09-29 01:32:15,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:32:15,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 01:32:18,835 INFO [train.py:1039] (1/4) Epoch 6, batch 4400, loss[loss=0.2035, simple_loss=0.2734, pruned_loss=0.06683, over 24464.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.2993, pruned_loss=0.08743, over 4715612.62 frames. ], batch size: 58, lr: 1.65e-02, grad_scale: 32.0 2023-09-29 01:32:19,026 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 01:32:19,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:23,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:23,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:25,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:27,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 01:32:28,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 01:32:28,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 01:32:28,900 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 01:32:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:32:30,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:32,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 01:32:33,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:34,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.42 vs. limit=22.5 2023-09-29 01:32:35,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:36,002 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 01:32:36,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=206466.66666666666, ans=0.125 2023-09-29 01:32:38,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:38,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 01:32:40,323 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 01:32:43,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 01:32:43,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 01:32:43,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 01:32:43,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:45,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:45,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:46,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:49,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 01:32:49,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 01:32:51,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:53,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:32:53,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:54,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:54,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:54,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 01:32:57,952 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 01:33:00,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:06,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:33:06,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=206600.0, ans=0.125 2023-09-29 01:33:10,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 01:33:12,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.45 vs. limit=22.5 2023-09-29 01:33:14,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:33:16,814 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-09-29 01:33:19,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:20,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:33:21,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=206600.0, ans=0.125 2023-09-29 01:33:22,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 01:33:22,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:33:22,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:33:22,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:33:22,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:33:29,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 01:33:31,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=206666.66666666666, ans=0.125 2023-09-29 01:33:33,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 01:33:34,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 01:33:34,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:33:34,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 01:33:36,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:33:40,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:33:41,391 INFO [train.py:1039] (1/4) Epoch 6, batch 4450, loss[loss=0.2902, simple_loss=0.3348, pruned_loss=0.1228, over 22728.00 frames. ], tot_loss[loss=0.238, simple_loss=0.3002, pruned_loss=0.08788, over 4718386.43 frames. ], batch size: 322, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:33:41,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 01:33:44,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:48,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:49,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:33:53,726 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.46 vs. limit=15.0 2023-09-29 01:33:54,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:33:54,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:33:59,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:59,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=206800.0, ans=0.125 2023-09-29 01:33:59,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.51 vs. limit=15.0 2023-09-29 01:34:00,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:34:04,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:34:05,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:05,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 01:34:05,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:07,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:07,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:07,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:34:11,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:34:16,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:17,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:19,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:20,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:20,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:34:26,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:34:26,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 01:34:26,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 01:34:26,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:34:29,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:29,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 01:34:32,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:34:36,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:37,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 01:34:37,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:37,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:37,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:34:37,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:40,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:45,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:34:46,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 01:34:48,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:34:49,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:49,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:52,741 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.158e+02 2.416e+02 2.828e+02 3.801e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 01:34:52,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:52,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:34:55,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:34:58,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 01:34:58,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=207000.0, ans=0.1 2023-09-29 01:35:01,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:35:04,022 INFO [train.py:1039] (1/4) Epoch 6, batch 4500, loss[loss=0.2253, simple_loss=0.2838, pruned_loss=0.08339, over 23273.00 frames. ], tot_loss[loss=0.2378, simple_loss=0.2999, pruned_loss=0.08788, over 4713031.47 frames. ], batch size: 119, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:35:05,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:07,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 01:35:07,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 01:35:08,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:16,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:35:16,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:16,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:35:18,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:35:18,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:18,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:21,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=207133.33333333334, ans=0.025 2023-09-29 01:35:32,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:34,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:35:36,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:35:36,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:35:37,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:35:39,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=207200.0, ans=0.125 2023-09-29 01:35:43,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:35:47,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:35:50,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=207200.0, ans=0.125 2023-09-29 01:35:51,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:35:54,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:35:54,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 01:35:57,126 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.75 vs. limit=15.0 2023-09-29 01:35:57,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:35:57,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:58,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.93 vs. limit=6.0 2023-09-29 01:35:59,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:36:01,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:36:01,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 01:36:01,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:36:01,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:02,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=207266.66666666666, ans=0.1 2023-09-29 01:36:06,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:36:06,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:36:09,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:09,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=207333.33333333334, ans=0.125 2023-09-29 01:36:11,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:36:11,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:36:14,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 01:36:15,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 01:36:15,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 01:36:20,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 01:36:25,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 01:36:26,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:27,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.83 vs. limit=6.0 2023-09-29 01:36:27,958 INFO [train.py:1039] (1/4) Epoch 6, batch 4550, loss[loss=0.2429, simple_loss=0.3141, pruned_loss=0.08581, over 24067.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2977, pruned_loss=0.08722, over 4707121.93 frames. ], batch size: 80, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:36:29,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:29,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:34,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:35,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.65 vs. limit=6.0 2023-09-29 01:36:40,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:36:42,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:36:45,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:36:45,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:36:45,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:47,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:47,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:51,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:36:55,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 01:36:55,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 01:36:57,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:36:57,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=207466.66666666666, ans=0.125 2023-09-29 01:36:58,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 01:37:02,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 01:37:02,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:05,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 01:37:08,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:37:09,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=207533.33333333334, ans=0.05 2023-09-29 01:37:11,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:37:14,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 01:37:17,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:20,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:20,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:21,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:23,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 01:37:24,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 01:37:25,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:37:25,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 01:37:26,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 01:37:26,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:27,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=207600.0, ans=0.125 2023-09-29 01:37:27,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=207600.0, ans=0.125 2023-09-29 01:37:28,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:28,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:30,558 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:37:31,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:37:33,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:37:33,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 01:37:36,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.77 vs. limit=10.0 2023-09-29 01:37:37,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:37,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:37:37,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 01:37:37,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:37:37,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 01:37:39,095 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 2.083e+02 2.307e+02 2.767e+02 3.692e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 01:37:41,569 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.43 vs. limit=15.0 2023-09-29 01:37:42,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:37:42,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:37:44,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:37:46,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:46,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:37:46,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=207666.66666666666, ans=0.2 2023-09-29 01:37:47,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:37:49,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=207733.33333333334, ans=0.2 2023-09-29 01:37:50,416 INFO [train.py:1039] (1/4) Epoch 6, batch 4600, loss[loss=0.2268, simple_loss=0.2869, pruned_loss=0.08332, over 23717.00 frames. ], tot_loss[loss=0.2344, simple_loss=0.2962, pruned_loss=0.08627, over 4705313.31 frames. ], batch size: 149, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:37:50,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:37:52,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:53,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:56,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:37:56,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:37:58,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:37:59,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 01:38:00,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:38:05,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:38:05,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:08,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:17,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 01:38:18,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:20,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:22,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:38:22,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:25,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=207866.66666666666, ans=0.0 2023-09-29 01:38:28,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 01:38:28,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:38:28,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:38:34,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:34,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:38:36,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:38:43,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 01:38:44,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:38:44,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=207933.33333333334, ans=0.2 2023-09-29 01:38:48,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:50,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:38:53,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:53,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 01:38:53,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:55,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 01:38:55,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:55,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:38:58,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:58,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:00,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:00,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 01:39:00,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 01:39:00,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 01:39:00,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:02,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:03,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:05,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:08,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=208000.0, ans=0.07 2023-09-29 01:39:11,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=208066.66666666666, ans=0.09899494936611666 2023-09-29 01:39:13,089 INFO [train.py:1039] (1/4) Epoch 6, batch 4650, loss[loss=0.2166, simple_loss=0.2886, pruned_loss=0.07231, over 24326.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2958, pruned_loss=0.08596, over 4713884.68 frames. ], batch size: 61, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:39:16,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:39:18,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:20,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:20,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:39:21,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:21,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:21,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:24,016 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.90 vs. limit=15.0 2023-09-29 01:39:26,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 01:39:31,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:39:32,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 01:39:32,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:32,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 01:39:34,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:39:34,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 01:39:34,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 01:39:34,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:36,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:39:39,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:39:40,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:40,696 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 01:39:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:44,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 01:39:47,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:47,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:39:47,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 01:39:50,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:52,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.31 vs. limit=15.0 2023-09-29 01:39:55,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:39:58,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:04,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:06,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:08,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:08,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:40:11,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 01:40:12,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 01:40:12,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 01:40:12,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 01:40:15,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:21,328 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.70 vs. limit=22.5 2023-09-29 01:40:23,233 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.129e+02 2.354e+02 2.649e+02 3.887e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 01:40:23,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:40:23,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:23,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 01:40:25,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:27,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:27,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:40:29,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:40:32,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:40:32,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:34,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:35,760 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.15 vs. limit=15.0 2023-09-29 01:40:36,326 INFO [train.py:1039] (1/4) Epoch 6, batch 4700, loss[loss=0.2651, simple_loss=0.3174, pruned_loss=0.1064, over 23699.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2962, pruned_loss=0.08573, over 4712289.33 frames. ], batch size: 232, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:40:36,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:38,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:40:38,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:40:38,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 01:40:38,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.68 vs. limit=15.0 2023-09-29 01:40:39,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:40:41,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 01:40:41,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.75 vs. limit=15.0 2023-09-29 01:40:47,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=208400.0, ans=0.125 2023-09-29 01:40:49,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:49,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:50,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:40:51,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:53,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:40:56,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 01:40:58,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 01:41:02,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:02,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:41:02,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:41:07,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:08,811 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.93 vs. limit=15.0 2023-09-29 01:41:14,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:41:15,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:41:18,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:41:26,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 01:41:26,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:41:27,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:31,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 01:41:34,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:41:40,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:41:40,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 01:41:43,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:43,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:46,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:47,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:41:47,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 01:41:48,638 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 01:41:50,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:53,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 01:41:54,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:57,783 INFO [train.py:1039] (1/4) Epoch 6, batch 4750, loss[loss=0.2684, simple_loss=0.3168, pruned_loss=0.11, over 22830.00 frames. ], tot_loss[loss=0.234, simple_loss=0.2967, pruned_loss=0.08571, over 4717175.29 frames. ], batch size: 322, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:41:58,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 01:41:59,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:42:00,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=208733.33333333334, ans=0.125 2023-09-29 01:42:01,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:04,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:06,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:42:07,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 01:42:07,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:11,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 01:42:13,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:42:15,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:42:15,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:20,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 01:42:24,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:42:26,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 01:42:26,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:26,877 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:42:31,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:31,896 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=15.0 2023-09-29 01:42:32,723 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 01:42:32,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 01:42:38,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 01:42:39,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=208866.66666666666, ans=0.125 2023-09-29 01:42:40,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:42,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:42:44,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=208866.66666666666, ans=0.1 2023-09-29 01:42:46,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:42:46,634 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 01:42:46,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:42:46,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=208933.33333333334, ans=0.1 2023-09-29 01:42:50,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:42:54,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:42:55,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 01:42:55,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 01:42:56,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.39 vs. limit=15.0 2023-09-29 01:42:57,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:57,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:42:57,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:58,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:42:58,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 01:43:00,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 01:43:03,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:03,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=209000.0, ans=0.2 2023-09-29 01:43:06,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:43:06,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 01:43:06,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:07,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=209000.0, ans=0.1 2023-09-29 01:43:09,479 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.147e+02 2.364e+02 2.785e+02 5.281e+02, threshold=4.728e+02, percent-clipped=1.0 2023-09-29 01:43:09,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:09,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:43:11,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:11,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:43:15,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:16,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 01:43:16,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 01:43:17,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 01:43:21,128 INFO [train.py:1039] (1/4) Epoch 6, batch 4800, loss[loss=0.2383, simple_loss=0.2954, pruned_loss=0.09063, over 23472.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.2992, pruned_loss=0.08798, over 4701762.33 frames. ], batch size: 120, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:43:21,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:43:23,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:23,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 01:43:27,787 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.09 vs. limit=10.0 2023-09-29 01:43:28,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=209066.66666666666, ans=0.125 2023-09-29 01:43:30,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:30,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:30,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=209066.66666666666, ans=0.1 2023-09-29 01:43:36,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:43:36,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=209133.33333333334, ans=0.0 2023-09-29 01:43:37,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:37,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:38,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=209133.33333333334, ans=0.0 2023-09-29 01:43:39,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 01:43:39,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=209133.33333333334, ans=0.2 2023-09-29 01:43:39,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=209133.33333333334, ans=0.125 2023-09-29 01:43:40,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:40,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:43:42,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:43:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:43:48,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=209133.33333333334, ans=0.0 2023-09-29 01:43:49,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:50,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:43:52,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:52,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:43:52,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:53,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:56,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:00,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:44:03,966 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.50 vs. limit=22.5 2023-09-29 01:44:04,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:44:04,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:07,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 01:44:07,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 01:44:08,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:08,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:44:09,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:44:09,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:09,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:44:11,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:44:11,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:11,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=209266.66666666666, ans=0.0 2023-09-29 01:44:14,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:17,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:18,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:23,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 01:44:25,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:25,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:25,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:44:26,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:30,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:33,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:44:33,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:33,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:44:34,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:44:34,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:44:38,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:39,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:39,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:40,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 01:44:42,324 INFO [train.py:1039] (1/4) Epoch 6, batch 4850, loss[loss=0.2306, simple_loss=0.2822, pruned_loss=0.08944, over 23733.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2983, pruned_loss=0.08721, over 4708123.22 frames. ], batch size: 232, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:44:42,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 01:44:42,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:44:42,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:45,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:52,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 01:44:53,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:56,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:44:59,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:44:59,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:03,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:45:05,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:45:06,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:45:06,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 01:45:07,071 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:45:12,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:45:14,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:45:14,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:45:15,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:45:15,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 01:45:18,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:45:19,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:23,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:24,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 01:45:24,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 01:45:27,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:45:34,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:45:34,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 01:45:36,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:45:36,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:45:38,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:45:42,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 01:45:42,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:42,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 01:45:44,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:44,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:45:45,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 01:45:53,395 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.040e+02 2.308e+02 2.752e+02 3.700e+02, threshold=4.617e+02, percent-clipped=0.0 2023-09-29 01:45:55,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:00,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:46:00,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:04,394 INFO [train.py:1039] (1/4) Epoch 6, batch 4900, loss[loss=0.2213, simple_loss=0.2804, pruned_loss=0.08116, over 24419.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2977, pruned_loss=0.08728, over 4692092.67 frames. ], batch size: 63, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:46:06,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 01:46:06,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:46:11,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:14,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:14,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:46:17,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 01:46:23,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 01:46:26,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 01:46:26,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=209800.0, ans=0.125 2023-09-29 01:46:28,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 01:46:28,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:29,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:29,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:46:29,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:29,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:46:29,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 01:46:33,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 01:46:33,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:46:33,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=209800.0, ans=0.0 2023-09-29 01:46:34,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:46:36,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:39,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:46:39,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:42,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:42,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 01:46:44,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:46:44,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:44,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 01:46:44,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 01:46:46,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=209866.66666666666, ans=0.0 2023-09-29 01:46:51,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 01:46:53,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:46:54,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:46:54,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:46:55,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=209933.33333333334, ans=0.09899494936611666 2023-09-29 01:46:56,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:56,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:46:56,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:46:56,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 01:46:59,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:01,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:47:02,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:47:05,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 01:47:07,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:47:07,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 01:47:07,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 01:47:08,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=22.5 2023-09-29 01:47:14,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:14,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=210000.0, ans=0.2 2023-09-29 01:47:15,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:47:18,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 01:47:18,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:18,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:47:19,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:26,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:26,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:47:26,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:27,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:47:28,390 INFO [train.py:1039] (1/4) Epoch 6, batch 4950, loss[loss=0.2583, simple_loss=0.3089, pruned_loss=0.1038, over 23758.00 frames. ], tot_loss[loss=0.235, simple_loss=0.2958, pruned_loss=0.0871, over 4683751.18 frames. ], batch size: 179, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:47:28,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:47:31,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:31,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:36,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 01:47:36,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 01:47:36,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.72 vs. limit=6.0 2023-09-29 01:47:37,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:47:37,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 01:47:37,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:39,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:47:39,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:47:39,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:47:41,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:42,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:47:44,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:47:44,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:46,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:46,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:51,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:47:57,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:59,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:48:00,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=210200.0, ans=0.0 2023-09-29 01:48:02,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:03,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:03,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:48:05,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 01:48:05,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 01:48:08,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:08,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=210200.0, ans=0.1 2023-09-29 01:48:08,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=210200.0, ans=0.1 2023-09-29 01:48:10,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:48:10,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:48:11,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:48:11,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:48:13,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:48:14,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:17,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:48:20,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:48:24,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:24,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:25,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 01:48:25,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:48:26,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=210266.66666666666, ans=0.125 2023-09-29 01:48:27,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:48:31,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:48:32,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:48:32,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:48:34,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:35,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:48:36,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:48:38,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:48:38,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:48:39,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:41,011 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.165e+02 2.494e+02 2.935e+02 4.100e+02, threshold=4.988e+02, percent-clipped=0.0 2023-09-29 01:48:41,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 01:48:44,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:48:49,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 01:48:49,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:48:50,375 INFO [train.py:1039] (1/4) Epoch 6, batch 5000, loss[loss=0.2153, simple_loss=0.2874, pruned_loss=0.07164, over 24464.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.2946, pruned_loss=0.08611, over 4695230.02 frames. ], batch size: 63, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:48:57,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:57,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:48:58,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 01:49:00,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 01:49:01,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:04,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 01:49:06,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:49:06,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:49:08,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 01:49:08,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:08,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:08,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 01:49:08,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:08,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:09,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=210466.66666666666, ans=0.1 2023-09-29 01:49:11,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 01:49:12,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 01:49:14,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:49:14,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 01:49:14,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:49:14,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:15,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:49:15,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 01:49:15,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 01:49:16,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.26 vs. limit=22.5 2023-09-29 01:49:18,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 01:49:18,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:18,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:20,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 01:49:20,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:49:21,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:23,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:23,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 01:49:23,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=210533.33333333334, ans=0.1 2023-09-29 01:49:24,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 01:49:25,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.59 vs. limit=15.0 2023-09-29 01:49:26,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:49:28,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:49:31,889 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 01:49:35,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:37,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:37,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:49:39,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.81 vs. limit=15.0 2023-09-29 01:49:41,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 01:49:41,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:41,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:42,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:49:42,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.91 vs. limit=10.0 2023-09-29 01:49:45,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:49:45,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:48,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:50,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:56,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 01:50:00,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.23 vs. limit=15.0 2023-09-29 01:50:01,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:01,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=210666.66666666666, ans=0.125 2023-09-29 01:50:09,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.53 vs. limit=15.0 2023-09-29 01:50:09,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:50:11,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:11,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:50:11,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:13,268 INFO [train.py:1039] (1/4) Epoch 6, batch 5050, loss[loss=0.2609, simple_loss=0.3066, pruned_loss=0.1076, over 23745.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2966, pruned_loss=0.0862, over 4716494.64 frames. ], batch size: 164, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:50:13,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:50:13,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:50:13,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:17,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.43 vs. limit=22.5 2023-09-29 01:50:18,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:18,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 01:50:19,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.73 vs. limit=15.0 2023-09-29 01:50:20,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:50:21,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:21,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:50:23,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 01:50:23,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:23,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:50:26,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:50:28,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:50:29,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:50:40,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 01:50:42,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:50:42,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:50:42,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 01:50:42,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:50:43,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:45,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:46,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:50:46,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 01:50:46,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 01:50:49,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:52,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:50:54,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:54,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 01:50:55,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:50:59,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 01:51:00,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:51:00,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:51:00,904 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:51:02,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:03,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:51:05,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:08,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:51:08,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:09,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:51:09,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:51:09,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 01:51:11,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:51:13,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:51:17,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:51:17,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 01:51:17,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:51:19,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:19,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:21,344 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 01:51:24,334 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.235e+02 2.528e+02 3.059e+02 5.158e+02, threshold=5.056e+02, percent-clipped=2.0 2023-09-29 01:51:24,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:24,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 01:51:24,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:28,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:29,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:29,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 01:51:31,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 01:51:32,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:32,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:51:33,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:51:34,397 INFO [train.py:1039] (1/4) Epoch 6, batch 5100, loss[loss=0.1917, simple_loss=0.2547, pruned_loss=0.0643, over 24429.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.297, pruned_loss=0.08635, over 4712101.40 frames. ], batch size: 58, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:51:36,171 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 01:51:39,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:42,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.50 vs. limit=12.0 2023-09-29 01:51:43,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 01:51:43,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 01:51:43,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=211066.66666666666, ans=0.1 2023-09-29 01:51:44,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:46,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:50,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:50,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 01:51:50,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 01:51:54,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:56,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:52:00,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:52:04,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 01:52:05,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:06,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:52:06,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:52:09,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 01:52:10,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=211200.0, ans=0.0 2023-09-29 01:52:11,390 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 01:52:12,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:12,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 01:52:14,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 01:52:18,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:30,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:52:31,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 01:52:31,877 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 01:52:33,930 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 01:52:36,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 01:52:36,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:39,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 01:52:41,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=211333.33333333334, ans=0.0 2023-09-29 01:52:43,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 01:52:44,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:52:47,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:52:49,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 01:52:49,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:52:50,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 01:52:55,334 INFO [train.py:1039] (1/4) Epoch 6, batch 5150, loss[loss=0.2288, simple_loss=0.3024, pruned_loss=0.07762, over 24355.00 frames. ], tot_loss[loss=0.2359, simple_loss=0.2982, pruned_loss=0.08684, over 4722055.61 frames. ], batch size: 77, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:52:56,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:52:56,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:52:56,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:52:57,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:52:59,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:52:59,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:53:00,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 01:53:00,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 01:53:00,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 01:53:00,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:53:00,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 01:53:03,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:03,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 01:53:05,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:06,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:12,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:53:12,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 01:53:14,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:15,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:53:17,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:53:17,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:17,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:17,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:53:17,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:53:18,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 01:53:20,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:53:20,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:53:23,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:53:25,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 01:53:27,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:53:29,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=211533.33333333334, ans=0.125 2023-09-29 01:53:33,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:53:35,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 01:53:37,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=15.0 2023-09-29 01:53:39,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:43,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=211600.0, ans=0.125 2023-09-29 01:53:46,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:49,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:52,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:53:52,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:53:55,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 01:53:59,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:59,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:53:59,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:54:04,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:05,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:54:05,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 01:54:08,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.054e+02 2.300e+02 2.668e+02 5.365e+02, threshold=4.600e+02, percent-clipped=1.0 2023-09-29 01:54:11,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:13,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:54:13,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=211666.66666666666, ans=0.0 2023-09-29 01:54:14,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:54:14,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:54:14,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:54:16,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:54:16,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:54:16,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:54:18,892 INFO [train.py:1039] (1/4) Epoch 6, batch 5200, loss[loss=0.1957, simple_loss=0.2659, pruned_loss=0.06278, over 24570.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.2988, pruned_loss=0.08771, over 4720626.60 frames. ], batch size: 60, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:54:22,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:54:24,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:54:27,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 01:54:30,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:54:30,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:32,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:34,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:54:34,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:37,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 01:54:40,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:54:41,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:43,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 01:54:46,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:54:47,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:54:48,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 01:54:48,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 01:54:48,786 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.14 vs. limit=15.0 2023-09-29 01:54:51,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 01:54:51,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:51,928 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 01:54:51,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:56,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:56,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:54:57,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 01:54:57,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:00,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=211866.66666666666, ans=0.0 2023-09-29 01:55:01,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:03,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 01:55:05,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 01:55:05,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 01:55:10,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 01:55:11,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:55:16,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:55:16,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:17,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 01:55:19,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:19,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:55:19,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:19,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:55:24,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:24,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:55:30,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:55:30,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=212000.0, ans=0.125 2023-09-29 01:55:32,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:32,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:36,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:38,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 01:55:39,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:39,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:55:41,378 INFO [train.py:1039] (1/4) Epoch 6, batch 5250, loss[loss=0.2448, simple_loss=0.2891, pruned_loss=0.1003, over 23742.00 frames. ], tot_loss[loss=0.2359, simple_loss=0.2979, pruned_loss=0.08701, over 4717862.05 frames. ], batch size: 212, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:55:42,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:42,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:55:42,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:55:45,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:48,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:48,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:55:49,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:55:56,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:57,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:56:00,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:56:00,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=212133.33333333334, ans=0.125 2023-09-29 01:56:01,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:56:05,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 01:56:05,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:56:06,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:56:12,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=212200.0, ans=0.125 2023-09-29 01:56:19,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=212200.0, ans=0.1 2023-09-29 01:56:32,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.04 vs. limit=15.0 2023-09-29 01:56:46,382 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=25.86 vs. limit=22.5 2023-09-29 01:56:46,950 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.177e+02 2.529e+02 3.121e+02 4.794e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 01:56:47,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.69 vs. limit=15.0 2023-09-29 01:56:55,338 INFO [train.py:1039] (1/4) Epoch 6, batch 5300, loss[loss=0.2335, simple_loss=0.3046, pruned_loss=0.08119, over 23949.00 frames. ], tot_loss[loss=0.2343, simple_loss=0.2954, pruned_loss=0.08659, over 4706871.07 frames. ], batch size: 86, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:57:07,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=212400.0, ans=0.125 2023-09-29 01:57:11,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:57:11,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 01:57:11,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 01:57:11,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:11,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:11,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:11,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:11,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:11,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:12,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:12,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:57:12,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:57:13,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 01:57:13,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 01:57:13,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 01:57:13,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:57:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 01:57:13,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 01:57:13,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:14,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:14,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:14,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:14,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:57:15,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:15,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:15,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:15,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:15,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:15,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:57:15,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:15,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:57:16,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 01:57:16,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:17,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:17,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 01:57:17,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 01:57:17,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:57:17,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:17,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 01:57:17,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 01:57:17,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:18,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:57:18,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:18,999 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 01:57:19,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 01:57:19,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:57:19,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:19,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 01:57:19,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 01:57:20,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 01:57:20,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:30,918 INFO [train.py:1039] (1/4) Epoch 7, batch 0, loss[loss=0.2363, simple_loss=0.31, pruned_loss=0.08131, over 24674.00 frames. ], tot_loss[loss=0.2363, simple_loss=0.31, pruned_loss=0.08131, over 24674.00 frames. ], batch size: 73, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:57:30,919 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 01:57:45,844 INFO [train.py:1071] (1/4) Epoch 7, validation: loss=0.2938, simple_loss=0.3001, pruned_loss=0.1437, over 1125622.00 frames. 2023-09-29 01:57:45,845 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 01:57:47,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 01:57:48,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:57:51,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:57:56,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:56,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:57:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:58,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 01:58:00,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 01:58:03,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:03,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:07,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:07,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:09,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:58:09,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:10,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 01:58:12,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:22,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:58:22,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:24,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 01:58:25,372 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.23 vs. limit=15.0 2023-09-29 01:58:27,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:58:29,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:58:29,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=212613.33333333334, ans=0.125 2023-09-29 01:58:30,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:35,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:58:40,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:40,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=212680.0, ans=0.0 2023-09-29 01:58:46,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 01:58:51,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 01:58:51,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:58:51,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:51,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:58:53,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:54,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 01:58:56,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:57,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:59:01,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:04,537 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 01:59:05,954 INFO [train.py:1039] (1/4) Epoch 7, batch 50, loss[loss=0.2175, simple_loss=0.2904, pruned_loss=0.07228, over 24495.00 frames. ], tot_loss[loss=0.2309, simple_loss=0.2972, pruned_loss=0.08231, over 1082684.32 frames. ], batch size: 66, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:59:06,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:59:09,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:11,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:11,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 01:59:11,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:59:12,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:59:15,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:16,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:20,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:22,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 01:59:22,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:31,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:59:31,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 01:59:32,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 01:59:33,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=19.46 vs. limit=15.0 2023-09-29 01:59:36,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:59:36,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:59:38,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:38,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:59:39,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:59:39,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:59:39,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:48,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:59:49,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:49,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:59:51,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 01:59:54,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:59:54,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:59:54,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 01:59:56,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:58,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 02:00:00,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=213013.33333333334, ans=0.0 2023-09-29 02:00:01,120 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.205e+02 2.565e+02 2.922e+02 4.560e+02, threshold=5.129e+02, percent-clipped=0.0 2023-09-29 02:00:06,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:06,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:00:06,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:08,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:08,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:09,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=213013.33333333334, ans=0.125 2023-09-29 02:00:11,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 02:00:11,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 02:00:12,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.33 vs. limit=6.0 2023-09-29 02:00:13,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:13,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:15,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:00:16,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:00:16,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 02:00:16,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 02:00:18,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 02:00:20,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:20,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:00:21,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 02:00:21,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 02:00:21,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:23,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:24,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:00:24,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:00:27,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:00:29,756 INFO [train.py:1039] (1/4) Epoch 7, batch 100, loss[loss=0.2136, simple_loss=0.2743, pruned_loss=0.07642, over 23699.00 frames. ], tot_loss[loss=0.2318, simple_loss=0.2973, pruned_loss=0.08316, over 1885748.98 frames. ], batch size: 135, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:00:34,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:00:36,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:38,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 02:00:38,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:41,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:00:41,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:41,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:41,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:41,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:42,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 02:00:45,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=213213.33333333334, ans=0.125 2023-09-29 02:00:46,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:00:46,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:46,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:46,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:51,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 02:00:52,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:54,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:00:56,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:00:56,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=213213.33333333334, ans=0.2 2023-09-29 02:01:01,114 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 02:01:01,161 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 02:01:04,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:04,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:01:10,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:01:11,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:01:13,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:21,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:22,045 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 02:01:25,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:01:28,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:01:29,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=213346.66666666666, ans=0.125 2023-09-29 02:01:30,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:01:33,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:35,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:38,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:40,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:01:43,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:45,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:47,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:47,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:01:47,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:47,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 02:01:47,569 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 02:01:47,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:49,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:01:49,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:49,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=213413.33333333334, ans=0.125 2023-09-29 02:01:50,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 02:01:50,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:01:50,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:01:50,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:50,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:52,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:54,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:01:54,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:01:57,171 INFO [train.py:1039] (1/4) Epoch 7, batch 150, loss[loss=0.2329, simple_loss=0.3085, pruned_loss=0.07861, over 24465.00 frames. ], tot_loss[loss=0.2321, simple_loss=0.2974, pruned_loss=0.08338, over 2522226.56 frames. ], batch size: 69, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:01:57,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:58,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:58,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:00,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:02,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:04,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:07,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:02:07,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=213480.0, ans=0.1 2023-09-29 02:02:08,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:13,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 02:02:13,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 02:02:13,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 02:02:15,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:02:15,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:02:17,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:02:17,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=213546.66666666666, ans=0.0 2023-09-29 02:02:19,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:02:19,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:19,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:21,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:22,594 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 02:02:24,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:29,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=213613.33333333334, ans=0.125 2023-09-29 02:02:30,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:33,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:02:37,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 02:02:39,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:02:41,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:41,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:02:42,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:02:45,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:45,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:02:46,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:48,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 02:02:51,340 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.019e+02 2.375e+02 2.708e+02 4.033e+02, threshold=4.751e+02, percent-clipped=0.0 2023-09-29 02:02:55,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:55,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:02:55,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:02:55,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:02:58,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:59,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 02:03:01,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:03:02,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:03:04,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:03:06,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:03:06,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 02:03:06,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:03:06,185 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 02:03:12,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:14,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=213746.66666666666, ans=0.125 2023-09-29 02:03:15,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:03:17,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:03:19,257 INFO [train.py:1039] (1/4) Epoch 7, batch 200, loss[loss=0.2365, simple_loss=0.2892, pruned_loss=0.09192, over 23781.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2983, pruned_loss=0.08537, over 3021346.95 frames. ], batch size: 179, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:03:20,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 02:03:22,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:03:23,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:26,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 02:03:28,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:03:30,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:30,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=213813.33333333334, ans=0.05 2023-09-29 02:03:32,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:03:34,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:03:34,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:34,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:53,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:03:54,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:03:55,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:03:55,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:03:57,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:03:57,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:03:58,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:00,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:04:02,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:02,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:03,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 02:04:03,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:04:05,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:07,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:04:12,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=214013.33333333334, ans=0.5 2023-09-29 02:04:15,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:22,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:23,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:04:30,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:33,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 02:04:34,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:34,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:04:35,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:37,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:04:37,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 02:04:37,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:04:38,710 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 02:04:40,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:41,381 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:04:42,351 INFO [train.py:1039] (1/4) Epoch 7, batch 250, loss[loss=0.235, simple_loss=0.2822, pruned_loss=0.0939, over 23471.00 frames. ], tot_loss[loss=0.2314, simple_loss=0.2956, pruned_loss=0.08367, over 3402629.02 frames. ], batch size: 285, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:04:42,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:04:43,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:44,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:45,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:04:45,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:48,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:04:51,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:03,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:06,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:06,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:05:15,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:05:16,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:05:18,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:05:18,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:19,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:05:19,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:05:20,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:23,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:05:24,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 02:05:24,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:27,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:05:27,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:05:27,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:05:29,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:05:29,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:05:29,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:05:32,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:34,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:05:34,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:35,828 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.070e+02 2.355e+02 2.714e+02 5.110e+02, threshold=4.709e+02, percent-clipped=2.0 2023-09-29 02:05:36,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.53 vs. limit=22.5 2023-09-29 02:05:39,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:05:43,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=214346.66666666666, ans=0.0 2023-09-29 02:05:46,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:47,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.60 vs. limit=15.0 2023-09-29 02:05:49,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:54,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:55,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:05:59,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 02:05:59,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:59,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:06:02,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 02:06:02,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:06:03,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.80 vs. limit=15.0 2023-09-29 02:06:04,115 INFO [train.py:1039] (1/4) Epoch 7, batch 300, loss[loss=0.2372, simple_loss=0.3114, pruned_loss=0.08151, over 24312.00 frames. ], tot_loss[loss=0.2298, simple_loss=0.2939, pruned_loss=0.08289, over 3699501.74 frames. ], batch size: 74, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:06:04,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:06:04,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 02:06:09,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:10,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:12,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:06:12,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=214480.0, ans=0.125 2023-09-29 02:06:14,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 02:06:16,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:06:18,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:06:18,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 02:06:18,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:23,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:06:27,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:06:27,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 02:06:30,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 02:06:30,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:32,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.88 vs. limit=15.0 2023-09-29 02:06:33,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:33,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=214546.66666666666, ans=0.0 2023-09-29 02:06:34,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:34,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 02:06:34,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:06:39,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:06:41,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:06:41,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:06:46,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:06:46,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 02:06:48,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:06:50,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:52,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 02:06:52,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=214613.33333333334, ans=0.5 2023-09-29 02:06:53,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:57,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:06:58,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:58,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 02:06:59,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=214680.0, ans=0.2 2023-09-29 02:07:04,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:04,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:07:07,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:10,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:07:10,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 02:07:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:07:11,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:11,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 02:07:15,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:17,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:18,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:18,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:25,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:25,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 02:07:27,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:29,101 INFO [train.py:1039] (1/4) Epoch 7, batch 350, loss[loss=0.2357, simple_loss=0.2893, pruned_loss=0.09099, over 23766.00 frames. ], tot_loss[loss=0.2289, simple_loss=0.2922, pruned_loss=0.08277, over 3914887.79 frames. ], batch size: 179, lr: 1.52e-02, grad_scale: 16.0 2023-09-29 02:07:34,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:39,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:39,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:40,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 02:07:43,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:43,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 02:07:45,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:46,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 02:07:47,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:50,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 02:07:52,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:07:54,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:57,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:07:58,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:07:58,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:58,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:08:02,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:02,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:07,515 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:08:10,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:10,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:08:11,307 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.26 vs. limit=15.0 2023-09-29 02:08:12,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:08:12,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:17,468 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.04 vs. limit=12.0 2023-09-29 02:08:18,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 02:08:18,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:20,518 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.72 vs. limit=22.5 2023-09-29 02:08:24,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.070e+02 2.347e+02 2.700e+02 4.079e+02, threshold=4.694e+02, percent-clipped=0.0 2023-09-29 02:08:24,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:24,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:24,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:08:27,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 02:08:28,691 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.92 vs. limit=10.0 2023-09-29 02:08:29,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:30,813 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 02:08:32,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 02:08:32,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:35,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:08:35,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 02:08:38,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:41,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:08:41,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:42,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:42,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:45,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:48,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:51,118 INFO [train.py:1039] (1/4) Epoch 7, batch 400, loss[loss=0.2021, simple_loss=0.2729, pruned_loss=0.06564, over 24367.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2921, pruned_loss=0.08205, over 4100272.12 frames. ], batch size: 61, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:08:51,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:08:51,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 02:08:51,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:52,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:54,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:08:54,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:08:57,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:57,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:01,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 02:09:02,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 02:09:02,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:04,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 02:09:04,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:06,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215213.33333333334, ans=0.1 2023-09-29 02:09:08,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.73 vs. limit=15.0 2023-09-29 02:09:10,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:09:10,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:10,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 02:09:11,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:09:12,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:12,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:12,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:09:16,110 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 02:09:16,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 02:09:21,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:22,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:24,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 02:09:25,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 02:09:28,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:09:30,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=215280.0, ans=0.09899494936611666 2023-09-29 02:09:31,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:34,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=215280.0, ans=0.125 2023-09-29 02:09:37,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=215280.0, ans=0.125 2023-09-29 02:09:38,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 02:09:41,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:09:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 02:09:45,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:46,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=215346.66666666666, ans=0.125 2023-09-29 02:09:47,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:09:47,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 02:09:50,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:09:53,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:09:54,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:56,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:56,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 02:09:59,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:09:59,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 02:09:59,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=215413.33333333334, ans=0.1 2023-09-29 02:10:01,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=215413.33333333334, ans=0.0 2023-09-29 02:10:02,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:10:02,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:10:04,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 02:10:07,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:10:07,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:10:07,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:10:09,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 02:10:09,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:10:11,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:10:11,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:10:11,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 02:10:11,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=215413.33333333334, ans=0.125 2023-09-29 02:10:12,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:10:13,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=215480.0, ans=0.0 2023-09-29 02:10:14,728 INFO [train.py:1039] (1/4) Epoch 7, batch 450, loss[loss=0.2497, simple_loss=0.2984, pruned_loss=0.1005, over 23766.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.2936, pruned_loss=0.08341, over 4233360.86 frames. ], batch size: 179, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:10:14,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:10:18,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:10:29,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:29,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:10:32,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 02:10:32,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 02:10:37,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:10:38,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:40,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:44,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:47,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=215613.33333333334, ans=0.025 2023-09-29 02:10:48,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 02:10:48,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 02:10:50,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 02:10:52,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:10:52,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:54,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:10:57,276 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 02:10:57,301 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 02:10:57,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:57,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=215613.33333333334, ans=0.035 2023-09-29 02:10:58,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:11:00,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:11:04,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:11:04,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:11:04,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:11:05,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 02:11:07,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=215680.0, ans=0.125 2023-09-29 02:11:08,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:08,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=215680.0, ans=0.125 2023-09-29 02:11:10,022 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.891e+02 2.155e+02 2.361e+02 4.169e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 02:11:10,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:11:10,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:11:10,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=215680.0, ans=0.0 2023-09-29 02:11:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 02:11:16,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:11:18,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 02:11:18,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 02:11:20,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:26,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:11:26,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:30,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:11:30,306 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 02:11:32,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=215746.66666666666, ans=0.125 2023-09-29 02:11:33,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:34,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:11:35,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:35,066 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 02:11:36,816 INFO [train.py:1039] (1/4) Epoch 7, batch 500, loss[loss=0.2025, simple_loss=0.2686, pruned_loss=0.06822, over 24453.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.2942, pruned_loss=0.08328, over 4342475.76 frames. ], batch size: 58, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:11:37,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 02:11:37,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:41,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:11:44,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 02:11:46,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:11:47,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:47,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:49,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:11:49,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215813.33333333334, ans=0.1 2023-09-29 02:12:01,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:01,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:12:03,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:12:03,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:03,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 02:12:05,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:12:08,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:12:08,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=215946.66666666666, ans=0.125 2023-09-29 02:12:09,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:12:11,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:12:11,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:13,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 02:12:14,907 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 02:12:17,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:18,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=215946.66666666666, ans=0.0 2023-09-29 02:12:19,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:22,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:12:23,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=215946.66666666666, ans=0.125 2023-09-29 02:12:24,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 02:12:28,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:12:29,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:34,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:12:39,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:41,589 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.22 vs. limit=12.0 2023-09-29 02:12:45,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:48,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=216080.0, ans=0.125 2023-09-29 02:12:50,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 02:12:50,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:50,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:51,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 02:12:52,722 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.78 vs. limit=10.0 2023-09-29 02:12:53,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:12:53,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:59,427 INFO [train.py:1039] (1/4) Epoch 7, batch 550, loss[loss=0.2615, simple_loss=0.3098, pruned_loss=0.1066, over 22691.00 frames. ], tot_loss[loss=0.2322, simple_loss=0.2959, pruned_loss=0.08422, over 4424421.67 frames. ], batch size: 322, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:12:59,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 02:13:02,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 02:13:02,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:02,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 02:13:02,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:13:02,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:04,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:13:04,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=216146.66666666666, ans=0.1 2023-09-29 02:13:05,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:13:08,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:13:10,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 02:13:10,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:13:12,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=216146.66666666666, ans=15.0 2023-09-29 02:13:17,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:18,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:19,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:21,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:26,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=216213.33333333334, ans=0.125 2023-09-29 02:13:27,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 02:13:29,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 02:13:30,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:13:35,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:13:35,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:36,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:13:40,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:40,093 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 02:13:42,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:43,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:13:45,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:47,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:13:47,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:13:49,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:49,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 02:13:51,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 02:13:52,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:13:52,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:54,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:13:54,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:55,683 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.132e+02 2.384e+02 2.779e+02 4.607e+02, threshold=4.767e+02, percent-clipped=1.0 2023-09-29 02:13:56,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:13:59,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:13:59,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=216346.66666666666, ans=0.2 2023-09-29 02:14:01,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:14:02,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:02,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 02:14:04,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:14:05,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:05,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:14:06,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=216413.33333333334, ans=0.0 2023-09-29 02:14:07,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:08,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:14:09,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:14:13,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 02:14:19,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 02:14:22,410 INFO [train.py:1039] (1/4) Epoch 7, batch 600, loss[loss=0.2202, simple_loss=0.2653, pruned_loss=0.08758, over 23353.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.2965, pruned_loss=0.08467, over 4478330.28 frames. ], batch size: 285, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:14:22,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:14:22,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:14:22,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:30,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:14:34,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:14:34,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 02:14:36,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=216480.0, ans=0.1 2023-09-29 02:14:37,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:14:39,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:14:42,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:45,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 02:14:45,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:14:47,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=216546.66666666666, ans=0.125 2023-09-29 02:14:49,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.56 vs. limit=10.0 2023-09-29 02:14:50,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 02:14:50,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=216546.66666666666, ans=0.125 2023-09-29 02:14:51,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=216546.66666666666, ans=0.0 2023-09-29 02:14:55,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:14:55,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:55,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.22 vs. limit=10.0 2023-09-29 02:14:56,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:15:01,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:15:01,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:15:03,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:10,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=216680.0, ans=0.04949747468305833 2023-09-29 02:15:12,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:15:14,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:16,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:15:16,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:15:22,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 02:15:27,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:15:27,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:15:29,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=216746.66666666666, ans=0.0 2023-09-29 02:15:33,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 02:15:33,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:15:37,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 02:15:38,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:15:39,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:15:44,086 INFO [train.py:1039] (1/4) Epoch 7, batch 650, loss[loss=0.2354, simple_loss=0.3033, pruned_loss=0.08373, over 23983.00 frames. ], tot_loss[loss=0.2316, simple_loss=0.2947, pruned_loss=0.08419, over 4533094.74 frames. ], batch size: 86, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:15:45,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:15:47,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:15:48,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:15:51,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:15:51,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:15:54,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 02:15:56,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:16:01,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:16:01,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:04,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:07,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 02:16:11,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:11,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:16,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:16,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:16:18,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:19,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:21,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:16:21,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:22,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:16:24,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:16:25,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 02:16:25,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:25,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:25,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=216946.66666666666, ans=0.125 2023-09-29 02:16:28,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:30,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:30,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:30,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:16:34,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 02:16:34,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:16:34,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:16:36,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:16:37,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:38,087 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.68 vs. limit=22.5 2023-09-29 02:16:38,835 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.233e+02 2.449e+02 2.795e+02 3.907e+02, threshold=4.898e+02, percent-clipped=0.0 2023-09-29 02:16:38,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:16:40,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 02:16:40,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 02:16:42,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:42,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:42,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:16:42,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:44,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:51,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:51,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:52,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:54,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:54,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:16:56,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:17:02,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:17:02,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:03,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:03,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:05,235 INFO [train.py:1039] (1/4) Epoch 7, batch 700, loss[loss=0.232, simple_loss=0.3064, pruned_loss=0.07876, over 24346.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.2942, pruned_loss=0.08327, over 4590875.79 frames. ], batch size: 77, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:17:10,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 02:17:10,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 02:17:14,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 02:17:15,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:18,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:17:21,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 02:17:24,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:27,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:17:29,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:29,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=217213.33333333334, ans=0.0 2023-09-29 02:17:30,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:17:31,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:17:34,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:37,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:17:37,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:17:40,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 02:17:45,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 02:17:48,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:17:49,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:17:52,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:17:52,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=217280.0, ans=0.1 2023-09-29 02:17:57,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:17:57,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 02:18:01,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:02,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:18:02,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 02:18:03,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=217346.66666666666, ans=0.1 2023-09-29 02:18:05,876 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.23 vs. limit=15.0 2023-09-29 02:18:06,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:18:08,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:11,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:18:14,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:18:14,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 02:18:18,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 02:18:20,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 02:18:23,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:25,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:18:29,126 INFO [train.py:1039] (1/4) Epoch 7, batch 750, loss[loss=0.2637, simple_loss=0.3294, pruned_loss=0.09895, over 24427.00 frames. ], tot_loss[loss=0.23, simple_loss=0.2935, pruned_loss=0.08322, over 4624492.90 frames. ], batch size: 77, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:18:29,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:29,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 02:18:33,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 02:18:33,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 02:18:33,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 02:18:35,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 02:18:35,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 02:18:35,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:18:38,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 02:18:38,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:39,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:18:40,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=217480.0, ans=0.2 2023-09-29 02:18:41,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:42,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:44,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:18:44,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:48,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:18:50,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:18:52,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:18:54,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:56,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:56,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 02:18:57,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:18:58,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:00,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:01,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:19:03,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 02:19:03,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:05,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 02:19:05,543 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 02:19:05,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 02:19:05,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:19:07,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:19:07,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:19:09,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=217613.33333333334, ans=0.1 2023-09-29 02:19:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:19:14,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:14,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:19:15,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=217613.33333333334, ans=0.125 2023-09-29 02:19:17,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:19:19,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:19,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 02:19:21,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:19:23,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 02:19:23,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:19:23,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=217680.0, ans=0.07 2023-09-29 02:19:24,700 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.977e+02 2.216e+02 2.498e+02 3.508e+02, threshold=4.433e+02, percent-clipped=0.0 2023-09-29 02:19:25,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:19:26,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 02:19:28,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:32,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:19:33,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:19:33,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:19:35,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:19:40,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 02:19:41,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:43,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:45,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:46,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:48,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:48,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:19:51,112 INFO [train.py:1039] (1/4) Epoch 7, batch 800, loss[loss=0.2236, simple_loss=0.3008, pruned_loss=0.07317, over 24560.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2929, pruned_loss=0.08278, over 4646663.54 frames. ], batch size: 71, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:19:57,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:57,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:59,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:59,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:00,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:00,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:03,340 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.45 vs. limit=22.5 2023-09-29 02:20:04,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:08,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:09,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:20:12,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 02:20:12,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:14,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:14,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:20:15,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:15,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 02:20:15,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:17,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 02:20:19,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:20,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=217880.0, ans=0.0 2023-09-29 02:20:22,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:23,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:20:23,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:25,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:25,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:30,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:20:31,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:20:31,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 02:20:33,975 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 02:20:34,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 02:20:35,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:20:35,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:20:37,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:39,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:20:44,422 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 02:20:44,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 02:20:44,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:20:47,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:20:51,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:20:55,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:56,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 02:20:56,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:20:58,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=218080.0, ans=0.0 2023-09-29 02:21:01,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 02:21:07,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:08,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=218080.0, ans=0.025 2023-09-29 02:21:10,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:21:11,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 02:21:12,920 INFO [train.py:1039] (1/4) Epoch 7, batch 850, loss[loss=0.2403, simple_loss=0.2945, pruned_loss=0.09308, over 23757.00 frames. ], tot_loss[loss=0.2305, simple_loss=0.2947, pruned_loss=0.08318, over 4668086.86 frames. ], batch size: 212, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:21:13,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:21:13,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=218146.66666666666, ans=0.0 2023-09-29 02:21:13,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.85 vs. limit=15.0 2023-09-29 02:21:15,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:16,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 02:21:16,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:18,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:21:19,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:19,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:21:21,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:21:22,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 02:21:23,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 02:21:23,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 02:21:24,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:26,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:21:26,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=218146.66666666666, ans=0.0 2023-09-29 02:21:29,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:29,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:21:29,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=218213.33333333334, ans=0.0 2023-09-29 02:21:32,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:33,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:21:33,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 02:21:35,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=218213.33333333334, ans=0.1 2023-09-29 02:21:35,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=218213.33333333334, ans=0.1 2023-09-29 02:21:38,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 02:21:40,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:42,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 02:21:45,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 02:21:47,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 02:21:49,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=218280.0, ans=0.0 2023-09-29 02:21:51,153 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 02:21:51,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:21:51,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:21:51,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:21:54,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:55,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:57,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 02:22:00,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:22:00,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:01,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:22:01,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:22:02,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:22:03,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:22:05,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 02:22:09,489 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.250e+02 2.603e+02 3.078e+02 4.971e+02, threshold=5.207e+02, percent-clipped=2.0 2023-09-29 02:22:09,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:22:09,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:11,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:22:11,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:12,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:15,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=218346.66666666666, ans=0.125 2023-09-29 02:22:16,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:22:18,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:22:20,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:22:20,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:20,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:22:23,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-09-29 02:22:28,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:22:29,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:29,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 02:22:30,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:31,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:32,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 02:22:32,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=218413.33333333334, ans=0.125 2023-09-29 02:22:35,500 INFO [train.py:1039] (1/4) Epoch 7, batch 900, loss[loss=0.2566, simple_loss=0.3082, pruned_loss=0.1025, over 23602.00 frames. ], tot_loss[loss=0.2318, simple_loss=0.2958, pruned_loss=0.08393, over 4679805.03 frames. ], batch size: 256, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:22:37,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:22:41,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:41,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 02:22:44,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:22:45,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 02:22:48,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:22:50,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:50,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:22:50,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:22:51,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:22:58,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=218546.66666666666, ans=0.125 2023-09-29 02:23:01,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:01,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:23:01,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:23:04,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:10,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 02:23:11,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=218613.33333333334, ans=0.0 2023-09-29 02:23:12,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:23:15,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=218613.33333333334, ans=0.1 2023-09-29 02:23:16,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:23:18,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:23:18,564 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 02:23:18,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 02:23:18,943 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:23:26,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:23:26,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:23:28,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:23:35,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:35,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:23:37,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 02:23:37,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:40,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 02:23:43,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:23:43,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:45,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:23:45,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:23:48,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 02:23:50,133 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 02:23:51,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:23:51,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 02:23:53,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:56,775 INFO [train.py:1039] (1/4) Epoch 7, batch 950, loss[loss=0.2566, simple_loss=0.3024, pruned_loss=0.1054, over 23814.00 frames. ], tot_loss[loss=0.2323, simple_loss=0.2963, pruned_loss=0.0842, over 4681117.31 frames. ], batch size: 212, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:23:58,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 02:24:03,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:05,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:05,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:06,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:24:08,416 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 02:24:08,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=218813.33333333334, ans=0.95 2023-09-29 02:24:12,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:13,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:14,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:14,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:24:15,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 02:24:16,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:24:18,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:19,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 02:24:19,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:24,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:24:25,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 02:24:26,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=218880.0, ans=0.125 2023-09-29 02:24:28,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 02:24:31,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:32,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:24:39,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:24:39,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:41,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 02:24:43,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:24:43,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:24:44,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:44,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:44,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:24:50,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 02:24:50,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:24:53,263 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.008e+02 2.208e+02 2.603e+02 6.954e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 02:24:54,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:54,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:54,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 02:24:54,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:54,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:24:56,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 02:24:59,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:25:01,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:25:08,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:08,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 02:25:09,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 02:25:10,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=219080.0, ans=0.0 2023-09-29 02:25:13,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:25:17,797 INFO [train.py:1039] (1/4) Epoch 7, batch 1000, loss[loss=0.2071, simple_loss=0.2768, pruned_loss=0.06871, over 24310.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.2941, pruned_loss=0.08337, over 4682128.98 frames. ], batch size: 56, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:25:19,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 02:25:19,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:22,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:25:25,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 02:25:25,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 02:25:30,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:30,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:32,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:34,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 02:25:39,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 02:25:41,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 02:25:43,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:25:43,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=219213.33333333334, ans=0.125 2023-09-29 02:25:44,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 02:25:46,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 02:25:46,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 02:25:48,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:49,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:57,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:58,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:25:59,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:59,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:59,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 02:25:59,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:01,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:26:01,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:26:02,617 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 02:26:06,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 02:26:06,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 02:26:07,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 02:26:10,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=219346.66666666666, ans=0.2 2023-09-29 02:26:11,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:26:18,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:18,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:26:20,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:21,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:26:21,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 02:26:24,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:26:24,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 02:26:24,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 02:26:26,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:26:27,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:26:28,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=219413.33333333334, ans=0.1 2023-09-29 02:26:29,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:26:34,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:26:34,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=219413.33333333334, ans=0.125 2023-09-29 02:26:37,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:40,873 INFO [train.py:1039] (1/4) Epoch 7, batch 1050, loss[loss=0.2309, simple_loss=0.3084, pruned_loss=0.07664, over 24328.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.293, pruned_loss=0.0827, over 4693422.00 frames. ], batch size: 74, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:26:40,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:26:41,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:26:43,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:26:43,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=219480.0, ans=0.025 2023-09-29 02:26:44,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:46,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:26:48,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=219480.0, ans=0.2 2023-09-29 02:26:49,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:26:50,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:26:51,733 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.24 vs. limit=22.5 2023-09-29 02:26:53,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:26:54,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:26:54,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:26:56,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:26:56,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 02:26:57,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:26:58,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 02:27:00,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:27:00,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 02:27:02,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:27:09,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:27:10,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:27:10,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:27:14,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 02:27:14,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 02:27:14,891 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.65 vs. limit=15.0 2023-09-29 02:27:16,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:27:18,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=219613.33333333334, ans=0.125 2023-09-29 02:27:19,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 02:27:21,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 02:27:22,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:26,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:27:28,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:27:29,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:27:30,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=219680.0, ans=0.0 2023-09-29 02:27:31,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:27:34,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:27:37,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 02:27:39,289 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.034e+02 2.317e+02 2.689e+02 3.658e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 02:27:39,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 02:27:39,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 02:27:39,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:41,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:27:42,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 02:27:47,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:27:49,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:49,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:27:49,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:49,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:55,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:55,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 02:27:55,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=219746.66666666666, ans=0.0 2023-09-29 02:27:55,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=219746.66666666666, ans=0.125 2023-09-29 02:27:56,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:56,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 02:27:56,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 02:27:58,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:28:00,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:03,714 INFO [train.py:1039] (1/4) Epoch 7, batch 1100, loss[loss=0.2488, simple_loss=0.3109, pruned_loss=0.09331, over 23399.00 frames. ], tot_loss[loss=0.2288, simple_loss=0.2924, pruned_loss=0.08255, over 4703712.16 frames. ], batch size: 93, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:28:05,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:28:10,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:28:10,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:28:11,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:11,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 02:28:15,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:28:18,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:28:20,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:28:24,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=219880.0, ans=0.0 2023-09-29 02:28:25,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:28:25,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 02:28:25,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:28:27,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:27,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:28:29,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=30.29 vs. limit=15.0 2023-09-29 02:28:32,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:28:33,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:28:40,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:28:41,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=219946.66666666666, ans=0.09899494936611666 2023-09-29 02:28:41,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=219946.66666666666, ans=0.2 2023-09-29 02:28:43,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 02:28:44,515 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 02:28:44,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:47,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:49,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:28:49,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:51,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 02:28:52,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:28:52,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:28:52,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:28:54,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:55,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 02:29:00,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:29:01,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 02:29:03,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:29:06,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:29:07,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=220013.33333333334, ans=0.125 2023-09-29 02:29:10,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 02:29:10,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:29:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:13,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:14,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:16,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 02:29:16,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:29:16,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:16,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=220080.0, ans=0.125 2023-09-29 02:29:18,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 02:29:18,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:29:19,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 02:29:20,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=220080.0, ans=0.125 2023-09-29 02:29:21,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:29:21,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:29:22,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:29:24,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=220146.66666666666, ans=0.125 2023-09-29 02:29:26,262 INFO [train.py:1039] (1/4) Epoch 7, batch 1150, loss[loss=0.2541, simple_loss=0.3126, pruned_loss=0.09777, over 23864.00 frames. ], tot_loss[loss=0.2301, simple_loss=0.2938, pruned_loss=0.08325, over 4701819.54 frames. ], batch size: 179, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:29:28,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:31,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:29:34,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:34,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:29:35,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 02:29:35,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:37,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=220146.66666666666, ans=0.125 2023-09-29 02:29:38,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 02:29:38,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:38,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:29:45,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 02:29:48,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:51,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:52,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:29:53,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 02:29:53,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:29:53,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:59,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 02:29:59,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:59,386 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:30:00,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:30:01,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.03 vs. limit=6.0 2023-09-29 02:30:10,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=220280.0, ans=0.0 2023-09-29 02:30:13,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:21,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:21,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 02:30:22,182 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=12.0 2023-09-29 02:30:22,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:24,042 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 2.131e+02 2.399e+02 2.911e+02 4.367e+02, threshold=4.797e+02, percent-clipped=0.0 2023-09-29 02:30:24,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:30,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 02:30:30,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:33,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=220413.33333333334, ans=0.125 2023-09-29 02:30:36,532 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 02:30:42,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:42,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=220413.33333333334, ans=0.025 2023-09-29 02:30:44,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:30:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:30:44,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:30:46,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:30:46,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.54 vs. limit=15.0 2023-09-29 02:30:48,897 INFO [train.py:1039] (1/4) Epoch 7, batch 1200, loss[loss=0.2218, simple_loss=0.2779, pruned_loss=0.08289, over 23303.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2934, pruned_loss=0.08248, over 4714750.42 frames. ], batch size: 119, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:30:50,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:30:50,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:30:54,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:30:54,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:55,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:30:57,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:30:58,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:31:00,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:00,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:03,487 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 02:31:06,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 02:31:10,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:31:13,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:31:17,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:18,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:31:18,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 02:31:19,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:21,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=220613.33333333334, ans=0.0 2023-09-29 02:31:27,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:31:27,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:31:27,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 02:31:29,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:31:32,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 02:31:37,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=220680.0, ans=0.2 2023-09-29 02:31:38,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 02:31:38,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:38,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:39,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:40,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:31:41,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:41,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:31:41,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:31:43,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 02:31:43,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:31:43,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:31:44,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:31:49,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:49,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:54,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:31:55,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=220746.66666666666, ans=0.0 2023-09-29 02:31:57,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:32:00,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 02:32:03,847 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 02:32:05,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:05,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.55 vs. limit=12.0 2023-09-29 02:32:08,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:32:10,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:32:11,615 INFO [train.py:1039] (1/4) Epoch 7, batch 1250, loss[loss=0.1845, simple_loss=0.2566, pruned_loss=0.05618, over 24480.00 frames. ], tot_loss[loss=0.2306, simple_loss=0.294, pruned_loss=0.08358, over 4715086.43 frames. ], batch size: 58, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:32:11,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:32:13,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=220813.33333333334, ans=0.05 2023-09-29 02:32:14,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 02:32:17,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:32:19,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:19,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 02:32:23,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:32:24,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:32:28,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:32:28,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:29,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:32:29,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:32,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:32:36,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 02:32:36,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:32:36,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:36,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:38,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:41,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:32:41,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:32:46,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 02:32:46,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:32:49,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:32:50,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 02:32:52,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:52,903 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 02:32:52,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:54,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:57,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:32:58,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=220946.66666666666, ans=0.125 2023-09-29 02:32:58,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.49 vs. limit=15.0 2023-09-29 02:33:04,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:33:04,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 02:33:05,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 02:33:05,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 02:33:06,549 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.10 vs. limit=15.0 2023-09-29 02:33:08,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.040e+02 2.260e+02 2.662e+02 4.055e+02, threshold=4.521e+02, percent-clipped=0.0 2023-09-29 02:33:10,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:12,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 02:33:12,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:15,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:33:15,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:33:15,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=221080.0, ans=0.0 2023-09-29 02:33:18,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 02:33:18,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:33:20,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:33:20,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:33:20,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:33:22,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 02:33:24,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:26,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:33:27,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:33:30,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:33:32,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:32,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 02:33:34,489 INFO [train.py:1039] (1/4) Epoch 7, batch 1300, loss[loss=0.224, simple_loss=0.2784, pruned_loss=0.08474, over 22736.00 frames. ], tot_loss[loss=0.2313, simple_loss=0.2949, pruned_loss=0.08382, over 4717722.69 frames. ], batch size: 322, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:33:39,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:40,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:33:42,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:33:43,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:45,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:33:47,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 02:33:50,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:33:52,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:33:53,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 02:33:58,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:34:03,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:04,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=221213.33333333334, ans=0.125 2023-09-29 02:34:05,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:06,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:34:08,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:10,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:34:10,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:34:10,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 02:34:18,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:34:18,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:34:18,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 02:34:20,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:34:22,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:34:25,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:34:27,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 02:34:27,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 02:34:29,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:33,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:33,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:34:36,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 02:34:37,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=221346.66666666666, ans=0.2 2023-09-29 02:34:38,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=221346.66666666666, ans=0.025 2023-09-29 02:34:39,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 02:34:40,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 02:34:43,054 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.74 vs. limit=22.5 2023-09-29 02:34:45,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:34:49,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 02:34:50,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:55,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=221480.0, ans=0.0 2023-09-29 02:34:57,758 INFO [train.py:1039] (1/4) Epoch 7, batch 1350, loss[loss=0.2192, simple_loss=0.287, pruned_loss=0.07564, over 24450.00 frames. ], tot_loss[loss=0.2296, simple_loss=0.293, pruned_loss=0.08316, over 4720155.26 frames. ], batch size: 63, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:34:58,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 02:35:01,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=221480.0, ans=0.0 2023-09-29 02:35:02,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:04,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:07,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:35:08,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:10,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:35:10,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:15,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:17,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 02:35:18,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:18,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=221546.66666666666, ans=0.2 2023-09-29 02:35:20,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:35:22,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 02:35:24,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:35:25,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:35:25,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 02:35:27,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 02:35:29,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 02:35:31,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:31,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 02:35:42,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:47,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=221680.0, ans=0.2 2023-09-29 02:35:51,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:51,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:52,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 02:35:55,566 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.214e+02 2.683e+02 3.104e+02 4.290e+02, threshold=5.366e+02, percent-clipped=0.0 2023-09-29 02:35:55,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:57,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 02:35:57,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:59,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:36:01,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:36:03,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 02:36:04,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:36:08,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 02:36:09,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 02:36:14,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=221746.66666666666, ans=0.0 2023-09-29 02:36:17,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 02:36:19,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:36:19,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=221813.33333333334, ans=0.0 2023-09-29 02:36:20,682 INFO [train.py:1039] (1/4) Epoch 7, batch 1400, loss[loss=0.2413, simple_loss=0.3075, pruned_loss=0.08754, over 23940.00 frames. ], tot_loss[loss=0.2283, simple_loss=0.2908, pruned_loss=0.08285, over 4703421.91 frames. ], batch size: 86, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:36:23,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:36:23,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:36:27,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 02:36:31,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 02:36:39,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:36:41,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:36:44,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:36:44,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:36:49,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:36:50,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:36:54,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=221946.66666666666, ans=0.0 2023-09-29 02:37:00,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:02,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:02,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=221946.66666666666, ans=0.125 2023-09-29 02:37:07,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 02:37:09,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:37:09,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:37:09,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:37:11,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:12,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:37:12,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:37:12,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=222013.33333333334, ans=0.0 2023-09-29 02:37:14,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:37:15,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 02:37:17,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:37:20,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:20,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=222013.33333333334, ans=0.125 2023-09-29 02:37:22,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=222013.33333333334, ans=0.05 2023-09-29 02:37:23,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:37:32,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 02:37:33,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:37:35,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:37:36,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:37:38,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:38,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:37:43,602 INFO [train.py:1039] (1/4) Epoch 7, batch 1450, loss[loss=0.2148, simple_loss=0.2748, pruned_loss=0.07745, over 23643.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2908, pruned_loss=0.08239, over 4715874.15 frames. ], batch size: 232, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:37:43,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:37:46,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:37:46,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:46,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:37:51,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:53,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:37:54,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:54,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 02:37:56,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:37:56,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 02:37:58,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:59,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:37:59,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 02:37:59,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:38:00,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:38:01,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 02:38:01,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:01,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:38:01,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=222213.33333333334, ans=0.025 2023-09-29 02:38:04,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:08,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:11,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:38:11,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:38:14,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:38:14,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:15,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=222280.0, ans=0.0 2023-09-29 02:38:18,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:18,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:38:18,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:20,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:23,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 02:38:26,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:38:29,634 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 02:38:32,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:32,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:38:34,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:36,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 02:38:39,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:41,366 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.076e+02 2.226e+02 2.557e+02 3.542e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 02:38:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 02:38:43,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 02:38:44,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:47,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:38:49,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:51,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 02:38:53,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 02:38:53,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 02:38:55,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:56,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:39:01,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=222413.33333333334, ans=0.125 2023-09-29 02:39:06,104 INFO [train.py:1039] (1/4) Epoch 7, batch 1500, loss[loss=0.2177, simple_loss=0.2955, pruned_loss=0.06993, over 24564.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2913, pruned_loss=0.08239, over 4711917.58 frames. ], batch size: 71, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:39:09,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 02:39:10,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:39:10,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:39:11,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:11,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:13,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:39:13,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 02:39:16,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:39:16,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:39:16,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:18,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:39:19,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:39:21,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:25,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:27,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 02:39:27,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:39:27,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:39:29,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:29,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=222546.66666666666, ans=0.125 2023-09-29 02:39:32,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 02:39:35,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 02:39:38,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:38,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 02:39:41,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:39:43,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.64 vs. limit=12.0 2023-09-29 02:39:43,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:39:45,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:45,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:39:45,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 02:39:46,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:39:46,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 02:39:50,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:56,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:39:56,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 02:40:03,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:40:05,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:40:09,825 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 02:40:09,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:09,915 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 02:40:11,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:13,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:13,128 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 02:40:14,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:40:19,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 02:40:21,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:24,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:24,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:24,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:25,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:25,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:40:28,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 02:40:28,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 02:40:29,315 INFO [train.py:1039] (1/4) Epoch 7, batch 1550, loss[loss=0.2335, simple_loss=0.2869, pruned_loss=0.0901, over 23693.00 frames. ], tot_loss[loss=0.2287, simple_loss=0.2921, pruned_loss=0.08263, over 4716487.95 frames. ], batch size: 232, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:40:29,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:40:30,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 02:40:31,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 02:40:31,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=222813.33333333334, ans=0.0 2023-09-29 02:40:32,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:36,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:36,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:40:36,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:40:37,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:39,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:40,973 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 02:40:42,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:42,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:40:43,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:40:45,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:40:45,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 02:40:47,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:48,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 02:40:48,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 02:40:48,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 02:40:50,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:51,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:40:56,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:59,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 02:40:59,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 02:41:07,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:13,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:41:13,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:41:13,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:41:13,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 02:41:16,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=223013.33333333334, ans=0.125 2023-09-29 02:41:19,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:41:21,481 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.98 vs. limit=12.0 2023-09-29 02:41:22,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:24,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:41:24,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:41:25,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:25,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 02:41:25,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:27,784 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.115e+02 2.414e+02 2.837e+02 4.599e+02, threshold=4.828e+02, percent-clipped=1.0 2023-09-29 02:41:28,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=223013.33333333334, ans=0.0 2023-09-29 02:41:29,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:41:29,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:30,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:41:30,881 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 02:41:31,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=223013.33333333334, ans=0.0 2023-09-29 02:41:32,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=223080.0, ans=0.125 2023-09-29 02:41:33,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:40,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 02:41:45,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:45,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:47,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 02:41:48,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:50,276 INFO [train.py:1039] (1/4) Epoch 7, batch 1600, loss[loss=0.2231, simple_loss=0.2961, pruned_loss=0.075, over 24388.00 frames. ], tot_loss[loss=0.2298, simple_loss=0.2934, pruned_loss=0.08315, over 4712245.12 frames. ], batch size: 77, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:41:50,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:50,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:41:50,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:41:51,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:41:53,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=223146.66666666666, ans=0.125 2023-09-29 02:41:54,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:55,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 02:41:56,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 02:41:58,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 02:41:59,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:02,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=223146.66666666666, ans=0.0 2023-09-29 02:42:03,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 02:42:03,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:42:06,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:42:08,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=223213.33333333334, ans=0.125 2023-09-29 02:42:08,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=223213.33333333334, ans=0.0 2023-09-29 02:42:11,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:42:15,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 02:42:17,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:42:19,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 02:42:19,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:19,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 02:42:27,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 02:42:33,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:35,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 02:42:35,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:36,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=223280.0, ans=0.0 2023-09-29 02:42:37,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:37,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:42:40,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 02:42:43,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 02:42:46,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:42:46,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:46,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:48,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:42:50,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:42:51,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:42:53,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:43:00,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:02,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:43:05,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 02:43:05,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:43:06,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 02:43:11,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:13,093 INFO [train.py:1039] (1/4) Epoch 7, batch 1650, loss[loss=0.2366, simple_loss=0.3118, pruned_loss=0.08069, over 24651.00 frames. ], tot_loss[loss=0.2312, simple_loss=0.2947, pruned_loss=0.08386, over 4714423.16 frames. ], batch size: 73, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:43:14,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:43:14,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:43:14,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 02:43:14,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 02:43:14,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 02:43:14,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 02:43:19,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:21,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:22,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:43:22,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:43:26,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:27,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 02:43:30,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:43:30,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:30,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:43:30,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:43:32,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 02:43:33,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 02:43:39,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:43:41,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:43:48,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 02:43:50,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:43:50,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.74 vs. limit=15.0 2023-09-29 02:43:51,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 02:43:52,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=223613.33333333334, ans=0.125 2023-09-29 02:43:55,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=223613.33333333334, ans=0.04949747468305833 2023-09-29 02:43:56,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:43:59,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:43:59,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:43:59,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:01,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:44:01,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:05,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:05,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:07,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:07,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:08,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:10,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:44:13,143 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.058e+02 2.405e+02 2.744e+02 4.179e+02, threshold=4.810e+02, percent-clipped=0.0 2023-09-29 02:44:13,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:13,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 02:44:16,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:18,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 02:44:18,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 02:44:18,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 02:44:19,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:21,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:44:21,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:23,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:23,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 02:44:26,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:27,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:44:27,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:29,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 02:44:34,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:34,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:44:35,909 INFO [train.py:1039] (1/4) Epoch 7, batch 1700, loss[loss=0.2249, simple_loss=0.2545, pruned_loss=0.09761, over 19320.00 frames. ], tot_loss[loss=0.2311, simple_loss=0.294, pruned_loss=0.08412, over 4692477.83 frames. ], batch size: 388, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:44:36,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 02:44:37,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:44:37,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:44:37,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:39,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:44:39,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:44:39,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 02:44:40,537 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.13 vs. limit=15.0 2023-09-29 02:44:41,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=223813.33333333334, ans=0.125 2023-09-29 02:44:42,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:44:50,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=223813.33333333334, ans=0.2 2023-09-29 02:44:51,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:52,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:44:57,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=223880.0, ans=0.1 2023-09-29 02:44:58,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:44:58,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:44:59,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:45:00,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:05,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 02:45:07,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:45:07,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:07,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=223946.66666666666, ans=0.125 2023-09-29 02:45:08,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:45:10,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:45:12,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 02:45:14,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 02:45:15,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:16,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 02:45:17,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:45:27,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:29,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:30,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:45:33,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:45:33,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 02:45:33,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:35,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:35,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 02:45:36,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:45:36,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:36,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:36,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:45:39,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=224013.33333333334, ans=0.1 2023-09-29 02:45:41,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:41,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:45:41,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:41,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:45:43,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:48,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:50,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 02:45:52,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:52,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:55,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 02:45:58,953 INFO [train.py:1039] (1/4) Epoch 7, batch 1750, loss[loss=0.2125, simple_loss=0.2748, pruned_loss=0.07512, over 23231.00 frames. ], tot_loss[loss=0.229, simple_loss=0.2921, pruned_loss=0.08296, over 4697121.75 frames. ], batch size: 105, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:46:01,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:03,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:04,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:46:05,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 02:46:06,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:46:06,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=224146.66666666666, ans=0.025 2023-09-29 02:46:09,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:46:09,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:14,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 02:46:15,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=224213.33333333334, ans=0.1 2023-09-29 02:46:17,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:19,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 02:46:21,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:21,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:46:23,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=224213.33333333334, ans=0.0 2023-09-29 02:46:25,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:46:26,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 02:46:27,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:46:28,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 02:46:28,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=224213.33333333334, ans=0.125 2023-09-29 02:46:36,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:46:41,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:46:41,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:44,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:44,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:46,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:46:48,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:48,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=224346.66666666666, ans=0.09899494936611666 2023-09-29 02:46:51,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:52,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:52,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 02:46:55,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:58,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 02:46:58,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:47:00,211 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 2.137e+02 2.415e+02 2.786e+02 3.944e+02, threshold=4.830e+02, percent-clipped=0.0 2023-09-29 02:47:00,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:01,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:47:05,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:47:06,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:47:08,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:09,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:47:13,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:17,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:17,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:47:19,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 02:47:19,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:21,252 INFO [train.py:1039] (1/4) Epoch 7, batch 1800, loss[loss=0.2291, simple_loss=0.2612, pruned_loss=0.09855, over 19197.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2911, pruned_loss=0.08236, over 4697106.09 frames. ], batch size: 388, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:47:21,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:47:21,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:21,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:47:21,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:47:22,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:47:24,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:47:26,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:27,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:47:29,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:33,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:47:33,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=224480.0, ans=0.125 2023-09-29 02:47:35,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:47:38,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:41,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:41,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:43,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:47:46,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:46,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 02:47:46,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:46,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=224546.66666666666, ans=0.0 2023-09-29 02:47:49,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:49,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=224546.66666666666, ans=0.0 2023-09-29 02:47:52,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 02:47:55,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 02:47:55,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 02:47:56,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:57,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:57,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:59,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:48:06,778 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 02:48:08,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:48:11,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:12,247 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.31 vs. limit=15.0 2023-09-29 02:48:12,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 02:48:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 02:48:14,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:48:16,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:48:16,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:48:22,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 02:48:23,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=224680.0, ans=0.2 2023-09-29 02:48:24,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=224680.0, ans=0.125 2023-09-29 02:48:26,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:48:27,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 02:48:27,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:48:27,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:29,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:48:29,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 02:48:32,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:48:32,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:48:35,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 02:48:35,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:38,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:38,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:48:38,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:48:40,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.71 vs. limit=15.0 2023-09-29 02:48:45,591 INFO [train.py:1039] (1/4) Epoch 7, batch 1850, loss[loss=0.3001, simple_loss=0.3289, pruned_loss=0.1356, over 19380.00 frames. ], tot_loss[loss=0.23, simple_loss=0.2931, pruned_loss=0.08344, over 4696146.52 frames. ], batch size: 388, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:48:45,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:48:45,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:46,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.13 vs. limit=22.5 2023-09-29 02:48:47,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:48:48,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:48:55,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:48:55,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=224813.33333333334, ans=10.0 2023-09-29 02:48:56,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 02:48:59,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 02:49:02,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 02:49:06,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:07,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 02:49:07,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:49:09,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=224880.0, ans=0.125 2023-09-29 02:49:11,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=224880.0, ans=0.0 2023-09-29 02:49:12,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=224880.0, ans=0.05 2023-09-29 02:49:19,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:49:21,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 02:49:25,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:49:25,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:49:28,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 02:49:28,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:29,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:49:31,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:49:33,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:49:37,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:49:40,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:49:41,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:41,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:49:41,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:44,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:46,012 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.062e+02 2.233e+02 2.588e+02 4.432e+02, threshold=4.466e+02, percent-clipped=0.0 2023-09-29 02:49:46,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:49:51,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 02:49:51,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:55,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:49:57,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:49:57,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 02:49:57,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 02:49:58,725 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 02:50:00,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=225080.0, ans=0.0 2023-09-29 02:50:02,032 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 02:50:03,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:50:03,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:50:03,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:03,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,127 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 02:50:05,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:50:05,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:50:08,203 INFO [train.py:1039] (1/4) Epoch 7, batch 1900, loss[loss=0.246, simple_loss=0.3147, pruned_loss=0.08864, over 24070.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.2934, pruned_loss=0.08303, over 4708231.40 frames. ], batch size: 86, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:50:08,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:50:09,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:50:09,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 02:50:13,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 02:50:13,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:50:14,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:21,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:24,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:50:24,867 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 02:50:26,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 02:50:27,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:27,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:50:27,969 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 02:50:29,493 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 02:50:29,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=225213.33333333334, ans=0.1 2023-09-29 02:50:33,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 02:50:36,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:50:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 02:50:42,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 02:50:54,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 02:50:55,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 02:50:55,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:57,391 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 02:50:57,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 02:50:57,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 02:50:57,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 02:50:57,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:02,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.98 vs. limit=6.0 2023-09-29 02:51:02,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 02:51:05,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:51:09,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:09,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 02:51:12,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:51:15,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 02:51:17,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:18,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=225413.33333333334, ans=0.0 2023-09-29 02:51:22,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:51:22,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:51:22,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:51:24,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:51:26,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:51:27,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:51:27,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:51:30,594 INFO [train.py:1039] (1/4) Epoch 7, batch 1950, loss[loss=0.3171, simple_loss=0.349, pruned_loss=0.1426, over 18916.00 frames. ], tot_loss[loss=0.2311, simple_loss=0.2945, pruned_loss=0.08388, over 4695874.44 frames. ], batch size: 388, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:51:30,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:30,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:51:34,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:51:34,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:34,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:36,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:40,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:42,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:51:44,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:44,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:51:45,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 02:51:45,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:51:47,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:47,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:50,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:51:50,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:51:50,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:53,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:51:55,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:57,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:51:57,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:51:57,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:00,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:05,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:52:05,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:05,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:52:05,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 02:52:07,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:52:07,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:52:07,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:12,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:14,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:52:19,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:52:19,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=225680.0, ans=0.2 2023-09-29 02:52:20,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:52:20,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:52:22,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 02:52:23,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:52:27,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:52:28,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:52:30,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:32,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.356e+02 2.885e+02 3.538e+02 5.665e+02, threshold=5.770e+02, percent-clipped=6.0 2023-09-29 02:52:37,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:38,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:41,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=225746.66666666666, ans=0.2 2023-09-29 02:52:42,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:44,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:47,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:52:47,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:49,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 02:52:49,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:52:49,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:51,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 02:52:53,917 INFO [train.py:1039] (1/4) Epoch 7, batch 2000, loss[loss=0.2271, simple_loss=0.2987, pruned_loss=0.07778, over 24510.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.2967, pruned_loss=0.08519, over 4690191.33 frames. ], batch size: 66, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:52:53,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:52:57,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:58,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:52:58,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:01,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:53:03,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:06,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 02:53:08,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:53:08,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=225880.0, ans=0.125 2023-09-29 02:53:10,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:53:13,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 02:53:15,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:53:15,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:53:17,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:53:19,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 02:53:19,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=225880.0, ans=0.1 2023-09-29 02:53:22,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:25,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 02:53:25,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:53:28,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 02:53:28,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:31,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:53:31,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:53:31,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:33,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:34,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:36,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 02:53:37,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 02:53:37,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:37,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:53:40,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=225946.66666666666, ans=0.0 2023-09-29 02:53:43,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:45,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:53:45,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:47,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:49,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:49,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:50,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:50,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:52,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:52,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=226013.33333333334, ans=0.2 2023-09-29 02:53:54,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=226013.33333333334, ans=0.1 2023-09-29 02:53:56,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:56,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 02:54:02,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:54:03,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:08,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:08,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:54:08,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=226080.0, ans=0.125 2023-09-29 02:54:11,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:15,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:15,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:15,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:54:16,481 INFO [train.py:1039] (1/4) Epoch 7, batch 2050, loss[loss=0.2274, simple_loss=0.2695, pruned_loss=0.09265, over 19879.00 frames. ], tot_loss[loss=0.2318, simple_loss=0.2947, pruned_loss=0.08439, over 4701427.28 frames. ], batch size: 388, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 02:54:16,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:54:19,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:19,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:23,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:23,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:30,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:54:31,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:54:31,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:33,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:54:35,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 02:54:35,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:54:36,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:54:38,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:54:40,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=226213.33333333334, ans=0.125 2023-09-29 02:54:44,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=226213.33333333334, ans=0.0 2023-09-29 02:54:44,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=226213.33333333334, ans=0.125 2023-09-29 02:54:47,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:47,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:51,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 02:54:53,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:54,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 02:54:54,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:56,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:00,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:00,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:55:00,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:02,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:55:02,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:55:03,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=226280.0, ans=0.0 2023-09-29 02:55:04,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:55:07,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:10,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:55:12,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:55:14,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:55:19,288 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.164e+02 2.389e+02 2.987e+02 5.025e+02, threshold=4.777e+02, percent-clipped=0.0 2023-09-29 02:55:19,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:26,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:55:28,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 02:55:33,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:33,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:55:36,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:55:38,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 02:55:40,169 INFO [train.py:1039] (1/4) Epoch 7, batch 2100, loss[loss=0.2052, simple_loss=0.2745, pruned_loss=0.06794, over 24321.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.2928, pruned_loss=0.08327, over 4701650.82 frames. ], batch size: 61, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:55:41,920 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 02:55:41,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:42,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:42,381 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:55:43,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:55:43,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:43,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 02:55:43,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 02:55:43,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=226480.0, ans=0.1 2023-09-29 02:55:46,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:47,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=226480.0, ans=0.125 2023-09-29 02:55:49,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:55:49,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:55:52,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:53,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=226480.0, ans=0.125 2023-09-29 02:55:53,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=226480.0, ans=0.1 2023-09-29 02:55:54,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:55:54,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 02:55:56,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:55:56,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 02:55:56,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 02:55:58,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:55:58,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=12.0 2023-09-29 02:55:59,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:55:59,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 02:56:00,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:56:05,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 02:56:05,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:56:08,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:56:08,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:56:13,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:56:13,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 02:56:14,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:14,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:56:17,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 02:56:17,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:17,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 02:56:17,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 02:56:19,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 02:56:22,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:56:23,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:56:27,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:27,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=226680.0, ans=0.0 2023-09-29 02:56:28,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:30,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:30,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=226680.0, ans=0.2 2023-09-29 02:56:32,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:32,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 02:56:32,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:32,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:34,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:34,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 02:56:36,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 02:56:36,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 02:56:41,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.82 vs. limit=15.0 2023-09-29 02:56:42,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:56:45,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:56:47,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 02:56:52,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:55,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:56:56,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:56:56,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:56:56,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:56:56,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:56:58,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:58,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:56:59,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:56:59,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:01,361 INFO [train.py:1039] (1/4) Epoch 7, batch 2150, loss[loss=0.247, simple_loss=0.2956, pruned_loss=0.09926, over 23782.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2916, pruned_loss=0.08225, over 4705011.41 frames. ], batch size: 164, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:57:02,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 02:57:04,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 02:57:04,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:07,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:57:07,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:57:07,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:57:07,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:57:12,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:57:15,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:15,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:18,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:57:18,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:20,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:57:23,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:23,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:57:23,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:57:26,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:27,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 02:57:32,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:34,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:57:35,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:35,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:35,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:36,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:57:37,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:37,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:57:38,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:57:41,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 02:57:41,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=226946.66666666666, ans=0.125 2023-09-29 02:57:42,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:57:43,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=226946.66666666666, ans=0.2 2023-09-29 02:57:43,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=226946.66666666666, ans=0.125 2023-09-29 02:57:44,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:44,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:46,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:57:46,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:57:49,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:49,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:57:51,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:51,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 02:57:52,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=227013.33333333334, ans=0.125 2023-09-29 02:57:53,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:57:56,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:56,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:58,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:58:00,792 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.96 vs. limit=22.5 2023-09-29 02:58:01,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:58:01,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:01,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:01,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 02:58:04,438 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.076e+02 2.402e+02 2.714e+02 3.938e+02, threshold=4.805e+02, percent-clipped=0.0 2023-09-29 02:58:04,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 02:58:04,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:58:05,998 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 02:58:06,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:06,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:58:06,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=227080.0, ans=0.04949747468305833 2023-09-29 02:58:07,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 02:58:07,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:58:07,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 02:58:07,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 02:58:07,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 02:58:09,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 02:58:10,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:12,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:58:12,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:58:12,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:13,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:58:15,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:15,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:22,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:58:24,683 INFO [train.py:1039] (1/4) Epoch 7, batch 2200, loss[loss=0.2324, simple_loss=0.2917, pruned_loss=0.08656, over 23573.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2918, pruned_loss=0.08219, over 4706386.34 frames. ], batch size: 256, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:58:25,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 02:58:29,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:58:34,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:36,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:58:36,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:58:37,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:58:39,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:39,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:58:39,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 02:58:44,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 02:58:44,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=227213.33333333334, ans=0.07 2023-09-29 02:58:45,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:58:47,712 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:58:53,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 02:58:58,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:58,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:58:58,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:59:03,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:59:03,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 02:59:05,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:59:08,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:08,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:59:10,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=227280.0, ans=0.0 2023-09-29 02:59:11,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:59:14,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:14,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=227346.66666666666, ans=0.1 2023-09-29 02:59:15,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:59:17,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:19,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 02:59:19,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=227346.66666666666, ans=0.0 2023-09-29 02:59:19,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=227346.66666666666, ans=0.0 2023-09-29 02:59:20,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:21,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 02:59:21,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=227346.66666666666, ans=0.0 2023-09-29 02:59:24,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:24,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:59:24,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:26,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=227346.66666666666, ans=0.125 2023-09-29 02:59:27,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:59:28,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:28,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:28,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:59:30,451 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:59:31,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:59:33,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:59:37,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=227413.33333333334, ans=0.1 2023-09-29 02:59:38,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:59:38,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:59:40,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:59:41,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 02:59:43,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:59:43,592 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 02:59:45,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:59:45,159 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 02:59:46,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:48,147 INFO [train.py:1039] (1/4) Epoch 7, batch 2250, loss[loss=0.3059, simple_loss=0.3377, pruned_loss=0.137, over 19518.00 frames. ], tot_loss[loss=0.2293, simple_loss=0.2926, pruned_loss=0.083, over 4687501.53 frames. ], batch size: 388, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:59:48,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:59:48,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:50,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=227480.0, ans=0.125 2023-09-29 02:59:51,362 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 02:59:51,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:59:54,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:01,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=227480.0, ans=0.125 2023-09-29 03:00:02,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:00:04,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:00:08,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:10,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:11,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:13,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 03:00:13,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:14,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:00:16,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 03:00:16,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=227546.66666666666, ans=0.125 2023-09-29 03:00:16,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=227546.66666666666, ans=0.1 2023-09-29 03:00:17,215 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.89 vs. limit=15.0 2023-09-29 03:00:18,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:00:18,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:19,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:25,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:27,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:00:27,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:00:27,927 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.57 vs. limit=15.0 2023-09-29 03:00:28,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 03:00:30,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:30,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=227613.33333333334, ans=0.09899494936611666 2023-09-29 03:00:31,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:00:35,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:37,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:39,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:00:39,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:41,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:41,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:00:42,094 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:00:43,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=227680.0, ans=0.1 2023-09-29 03:00:47,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:00:50,691 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.075e+02 2.318e+02 2.667e+02 5.695e+02, threshold=4.636e+02, percent-clipped=1.0 2023-09-29 03:00:50,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:00:55,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:00:56,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:00:56,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:01:01,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:01:03,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:01:03,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 03:01:04,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:04,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:01:08,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 03:01:08,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=227813.33333333334, ans=0.1 2023-09-29 03:01:09,681 INFO [train.py:1039] (1/4) Epoch 7, batch 2300, loss[loss=0.2392, simple_loss=0.3149, pruned_loss=0.08178, over 24300.00 frames. ], tot_loss[loss=0.2298, simple_loss=0.2933, pruned_loss=0.08313, over 4694405.74 frames. ], batch size: 74, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:01:10,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=227813.33333333334, ans=0.125 2023-09-29 03:01:13,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:01:13,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:19,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:20,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:01:23,504 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 03:01:25,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:25,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=227880.0, ans=0.0 2023-09-29 03:01:31,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:01:31,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:01:31,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:01:32,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:32,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 03:01:34,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:01:36,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:01:37,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:01:40,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:01:42,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=227946.66666666666, ans=0.2 2023-09-29 03:01:44,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:01:49,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:01:56,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:01:56,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:59,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:02:00,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=228013.33333333334, ans=15.0 2023-09-29 03:02:01,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:04,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:02:06,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:02:06,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:02:06,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 03:02:07,333 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.21 vs. limit=15.0 2023-09-29 03:02:09,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:02:09,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:11,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:11,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:02:11,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:12,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:02:12,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:02:12,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 03:02:12,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:02:12,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:14,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 03:02:21,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:02:25,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:02:30,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:30,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:02:30,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:02:30,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=228080.0, ans=0.1 2023-09-29 03:02:32,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:02:32,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:02:33,874 INFO [train.py:1039] (1/4) Epoch 7, batch 2350, loss[loss=0.2972, simple_loss=0.3396, pruned_loss=0.1274, over 19864.00 frames. ], tot_loss[loss=0.2307, simple_loss=0.2945, pruned_loss=0.08349, over 4694596.45 frames. ], batch size: 389, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:02:34,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:02:35,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 03:02:41,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:02:41,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 03:02:48,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 03:02:51,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:51,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=228213.33333333334, ans=0.0 2023-09-29 03:02:54,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:02:54,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:57,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 03:03:01,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:03:08,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 03:03:09,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:03:12,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:03:12,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:03:14,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:03:15,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 03:03:17,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:03:18,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:03:18,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:18,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:03:21,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:03:25,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 03:03:26,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:03:29,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:03:29,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:03:31,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 03:03:31,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:03:34,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 03:03:34,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:03:36,032 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.092e+02 2.385e+02 2.952e+02 4.935e+02, threshold=4.770e+02, percent-clipped=1.0 2023-09-29 03:03:36,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=228346.66666666666, ans=0.05 2023-09-29 03:03:39,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 03:03:39,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=228413.33333333334, ans=0.125 2023-09-29 03:03:44,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 03:03:46,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:46,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:03:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 03:03:46,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 03:03:49,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 03:03:50,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:03:51,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=228413.33333333334, ans=0.125 2023-09-29 03:03:55,177 INFO [train.py:1039] (1/4) Epoch 7, batch 2400, loss[loss=0.1994, simple_loss=0.2728, pruned_loss=0.063, over 24604.00 frames. ], tot_loss[loss=0.2298, simple_loss=0.2937, pruned_loss=0.08298, over 4697054.81 frames. ], batch size: 60, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 03:03:56,119 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.48 vs. limit=15.0 2023-09-29 03:03:57,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:03:59,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:04:01,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=228480.0, ans=0.125 2023-09-29 03:04:03,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:04:03,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 03:04:05,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 03:04:06,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=228480.0, ans=0.125 2023-09-29 03:04:09,235 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.01 vs. limit=15.0 2023-09-29 03:04:12,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:04:12,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:16,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 03:04:16,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:04:17,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:17,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 03:04:23,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=228546.66666666666, ans=0.0 2023-09-29 03:04:24,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:27,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 03:04:31,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:04:34,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 03:04:38,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:04:40,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:44,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:04:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 03:04:46,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:04:52,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:04:52,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=228680.0, ans=0.0 2023-09-29 03:04:54,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:04:56,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:57,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:04:57,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:04:59,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:04:59,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:00,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:00,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:05:03,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:04,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:05:04,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 03:05:06,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 03:05:09,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:05:09,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:11,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 03:05:11,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 03:05:11,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 03:05:12,012 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 03:05:12,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 03:05:13,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:05:16,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:16,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:16,849 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 03:05:17,628 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-09-29 03:05:18,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:18,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:05:19,797 INFO [train.py:1039] (1/4) Epoch 7, batch 2450, loss[loss=0.2329, simple_loss=0.2901, pruned_loss=0.08779, over 23718.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.292, pruned_loss=0.0824, over 4686878.86 frames. ], batch size: 149, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:05:23,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:05:24,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:29,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:29,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:29,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 03:05:31,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.82 vs. limit=15.0 2023-09-29 03:05:36,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:05:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:40,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:05:40,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:05:40,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:05:41,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 03:05:42,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=228880.0, ans=0.1 2023-09-29 03:05:45,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=15.0 2023-09-29 03:05:46,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:47,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:05:49,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:51,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:05:52,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:52,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:54,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:55,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 03:05:57,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:06:06,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:08,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:06:09,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:09,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:06:09,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=229013.33333333334, ans=0.0 2023-09-29 03:06:10,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:11,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:06:12,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 03:06:12,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=229013.33333333334, ans=0.125 2023-09-29 03:06:15,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:06:17,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:06:18,036 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.25 vs. limit=10.0 2023-09-29 03:06:20,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:06:20,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:23,665 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 2.159e+02 2.588e+02 3.094e+02 4.619e+02, threshold=5.175e+02, percent-clipped=0.0 2023-09-29 03:06:24,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:06:24,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 03:06:26,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:06:26,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:06:26,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 03:06:28,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:06:29,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:06:32,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:06:36,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:36,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:06:39,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 03:06:41,313 INFO [train.py:1039] (1/4) Epoch 7, batch 2500, loss[loss=0.2386, simple_loss=0.3164, pruned_loss=0.08037, over 24638.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2905, pruned_loss=0.08123, over 4689707.15 frames. ], batch size: 73, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:06:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:06:41,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.27 vs. limit=12.0 2023-09-29 03:06:46,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=229146.66666666666, ans=0.125 2023-09-29 03:06:49,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:06:59,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:07:01,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:07:01,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:07:01,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 03:07:06,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=229213.33333333334, ans=0.2 2023-09-29 03:07:09,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:07:09,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:10,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:07:10,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:07:12,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 03:07:12,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=229280.0, ans=0.1 2023-09-29 03:07:13,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:15,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 03:07:15,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 03:07:15,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:22,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:07:24,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:27,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:07:29,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 03:07:29,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:07:31,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:34,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:37,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:40,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:07:44,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:07:47,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 03:07:47,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:47,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:07:50,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:07:50,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:07:52,934 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 03:07:52,935 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 03:07:52,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 03:07:56,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:58,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 03:07:58,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 03:07:59,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:08:00,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 03:08:03,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=229480.0, ans=0.2 2023-09-29 03:08:04,195 INFO [train.py:1039] (1/4) Epoch 7, batch 2550, loss[loss=0.2343, simple_loss=0.2897, pruned_loss=0.08944, over 23826.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2908, pruned_loss=0.08133, over 4703505.28 frames. ], batch size: 195, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:08:04,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 03:08:06,720 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.02 vs. limit=15.0 2023-09-29 03:08:07,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:10,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:08:10,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:08:11,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:13,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 03:08:13,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:08:18,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 03:08:19,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:08:19,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=229546.66666666666, ans=0.0 2023-09-29 03:08:21,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:24,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:08:24,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 03:08:24,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:08:26,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:26,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:29,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:08:31,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 03:08:31,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:08:31,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:31,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 03:08:45,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:08:49,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:08:51,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:51,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:51,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:08:58,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:59,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:09:01,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:09:01,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:09:01,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.80 vs. limit=15.0 2023-09-29 03:09:02,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:09:02,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:09:06,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:06,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:07,776 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.133e+02 2.441e+02 2.869e+02 4.948e+02, threshold=4.883e+02, percent-clipped=0.0 2023-09-29 03:09:11,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:09:11,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 03:09:11,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:09:13,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:13,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:09:15,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:09:17,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:19,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=229746.66666666666, ans=0.125 2023-09-29 03:09:23,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:09:26,266 INFO [train.py:1039] (1/4) Epoch 7, batch 2600, loss[loss=0.1971, simple_loss=0.268, pruned_loss=0.06312, over 24318.00 frames. ], tot_loss[loss=0.226, simple_loss=0.2908, pruned_loss=0.08059, over 4711705.26 frames. ], batch size: 61, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:09:26,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:28,495 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 03:09:31,591 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 03:09:31,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:09:31,676 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 03:09:33,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 03:09:34,574 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 03:09:36,416 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:09:37,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:37,699 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 03:09:39,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 03:09:42,822 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 03:09:43,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:09:45,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 03:09:46,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 03:09:49,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:09:49,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 03:09:52,589 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.09 vs. limit=15.0 2023-09-29 03:09:53,316 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 03:09:53,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 03:09:59,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:09:59,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:01,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:01,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 03:10:03,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:10:08,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 03:10:14,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:16,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:17,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 03:10:19,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:19,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:19,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 03:10:20,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=230013.33333333334, ans=0.1 2023-09-29 03:10:22,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:10:22,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:10:24,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:29,531 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 03:10:29,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:30,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:10:32,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=230080.0, ans=0.125 2023-09-29 03:10:35,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:36,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.02 vs. limit=15.0 2023-09-29 03:10:37,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:10:37,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 03:10:37,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:40,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:10:42,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:10:46,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 03:10:48,336 INFO [train.py:1039] (1/4) Epoch 7, batch 2650, loss[loss=0.2088, simple_loss=0.2784, pruned_loss=0.06956, over 20588.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2913, pruned_loss=0.08132, over 4700269.19 frames. ], batch size: 45, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:10:48,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:50,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:10:54,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 03:10:54,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:54,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=230146.66666666666, ans=0.0 2023-09-29 03:10:55,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:10:57,363 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 03:10:57,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:10:59,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:01,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=230146.66666666666, ans=0.0 2023-09-29 03:11:03,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:11:05,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:11:08,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:11:09,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 03:11:09,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:11:09,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:11:13,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 03:11:15,151 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 03:11:18,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:18,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 03:11:19,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:21,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 03:11:26,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:11:26,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:31,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 03:11:31,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 03:11:35,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:11:38,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 03:11:39,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:41,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:41,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:11:41,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:41,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:41,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=230346.66666666666, ans=0.125 2023-09-29 03:11:44,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:46,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:48,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:48,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:11:49,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:11:51,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:51,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:11:52,730 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.070e+02 2.281e+02 2.771e+02 4.083e+02, threshold=4.562e+02, percent-clipped=0.0 2023-09-29 03:11:52,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:54,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:54,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:11:58,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:00,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:12:00,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:00,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 03:12:05,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:07,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:08,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:08,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:08,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:12:10,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:10,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=230480.0, ans=0.125 2023-09-29 03:12:11,467 INFO [train.py:1039] (1/4) Epoch 7, batch 2700, loss[loss=0.2109, simple_loss=0.2799, pruned_loss=0.07099, over 24499.00 frames. ], tot_loss[loss=0.2287, simple_loss=0.293, pruned_loss=0.08221, over 4701252.05 frames. ], batch size: 63, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:12:13,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:13,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 03:12:16,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:12:18,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:12:21,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:12:21,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:21,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:22,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:12:22,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:22,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:12:22,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:12:24,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 03:12:24,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:12:26,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=230546.66666666666, ans=0.125 2023-09-29 03:12:27,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:12:28,404 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.05 vs. limit=15.0 2023-09-29 03:12:28,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:12:29,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:32,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:12:32,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 03:12:33,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:12:42,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:12:42,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:12:43,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.30 vs. limit=15.0 2023-09-29 03:12:46,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=230613.33333333334, ans=0.0 2023-09-29 03:12:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:12:47,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:47,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:12:48,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:12:51,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:12:54,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:54,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:12:54,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:12:58,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:58,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:13:08,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:13:08,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:12,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=230680.0, ans=0.0 2023-09-29 03:13:13,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:13:13,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:16,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:17,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:19,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:13:20,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:22,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:22,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:25,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:13:26,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:26,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:29,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=230746.66666666666, ans=0.1 2023-09-29 03:13:30,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 03:13:30,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:32,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:13:32,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 03:13:34,010 INFO [train.py:1039] (1/4) Epoch 7, batch 2750, loss[loss=0.2221, simple_loss=0.2838, pruned_loss=0.08018, over 23572.00 frames. ], tot_loss[loss=0.229, simple_loss=0.2927, pruned_loss=0.08261, over 4701746.51 frames. ], batch size: 134, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:13:34,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 03:13:34,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:34,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=230813.33333333334, ans=0.0 2023-09-29 03:13:38,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:13:38,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:41,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:41,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:13:42,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:46,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:13:46,925 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.11 vs. limit=15.0 2023-09-29 03:13:47,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:13:47,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:13:47,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:47,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 03:13:47,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:13:47,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:54,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 03:13:57,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:57,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:57,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:57,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:13:58,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:00,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:14:01,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:01,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:08,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:14:08,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:14:08,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:14:09,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:10,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:14:16,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:18,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:14:18,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:23,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:23,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:14:25,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:14:31,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:14:31,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:14:31,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 03:14:36,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:36,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 03:14:38,315 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.012e+02 2.382e+02 2.660e+02 4.649e+02, threshold=4.763e+02, percent-clipped=1.0 2023-09-29 03:14:41,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:14:44,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:14:44,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 03:14:46,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:14:50,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:14:50,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 03:14:50,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:14:54,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:14:54,800 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.42 vs. limit=15.0 2023-09-29 03:14:55,147 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.65 vs. limit=15.0 2023-09-29 03:14:55,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:14:55,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:14:55,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 03:14:55,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:57,118 INFO [train.py:1039] (1/4) Epoch 7, batch 2800, loss[loss=0.2353, simple_loss=0.3032, pruned_loss=0.08368, over 23244.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.2905, pruned_loss=0.08158, over 4706541.33 frames. ], batch size: 93, lr: 1.47e-02, grad_scale: 32.0 2023-09-29 03:14:57,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:59,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=231146.66666666666, ans=0.09899494936611666 2023-09-29 03:15:00,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:00,215 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 03:15:00,216 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 03:15:03,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:04,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:15:06,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:15:09,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:15:11,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=231213.33333333334, ans=0.2 2023-09-29 03:15:11,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=231213.33333333334, ans=0.0 2023-09-29 03:15:12,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 03:15:15,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:15:17,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 03:15:17,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:19,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:15:19,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:20,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=231213.33333333334, ans=0.125 2023-09-29 03:15:23,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:23,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:23,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:15:25,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:15:35,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:15:37,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:40,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:40,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:15:42,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:48,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:15:48,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 03:15:48,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:50,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:50,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:15:53,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:55,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:59,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:16:02,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:16:04,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:04,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:16:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:16:04,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:16:05,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:16:05,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 03:16:05,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:16:08,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 03:16:10,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:10,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:16:10,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:16:11,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 03:16:17,966 INFO [train.py:1039] (1/4) Epoch 7, batch 2850, loss[loss=0.2143, simple_loss=0.2704, pruned_loss=0.07908, over 23407.00 frames. ], tot_loss[loss=0.2253, simple_loss=0.2888, pruned_loss=0.08086, over 4699865.24 frames. ], batch size: 285, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:16:18,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:16:18,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:16:19,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:16:19,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=231480.0, ans=0.0 2023-09-29 03:16:21,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:23,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=231480.0, ans=0.125 2023-09-29 03:16:25,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:16:25,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:16:25,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:16:29,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:29,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:31,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:16:32,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 03:16:38,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 03:16:38,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:40,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 03:16:40,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=231546.66666666666, ans=0.125 2023-09-29 03:16:41,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:45,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.31 vs. limit=15.0 2023-09-29 03:16:46,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 03:16:46,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 03:16:46,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.93 vs. limit=15.0 2023-09-29 03:16:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:00,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:00,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:17:02,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:17:02,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:17:02,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:17:04,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:17:04,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 03:17:06,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:17:07,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:07,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:09,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:12,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:13,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=231680.0, ans=0.125 2023-09-29 03:17:16,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:18,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=231680.0, ans=0.125 2023-09-29 03:17:19,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:17:19,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:21,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:24,055 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.041e+02 2.225e+02 2.602e+02 4.724e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 03:17:24,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:17:28,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:17:30,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 03:17:30,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 03:17:31,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:17:33,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:33,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 03:17:35,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:17:35,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:35,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:36,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:17:36,188 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 03:17:37,483 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 03:17:37,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:17:37,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:40,488 INFO [train.py:1039] (1/4) Epoch 7, batch 2900, loss[loss=0.2358, simple_loss=0.2938, pruned_loss=0.08888, over 23685.00 frames. ], tot_loss[loss=0.2251, simple_loss=0.2882, pruned_loss=0.08101, over 4686128.81 frames. ], batch size: 232, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:17:42,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:17:42,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:42,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:44,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 03:17:49,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:49,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 03:17:49,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 03:17:51,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:17:51,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:17:51,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=231813.33333333334, ans=10.0 2023-09-29 03:17:54,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:55,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:18:00,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:18:00,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:18:02,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:18:03,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 03:18:03,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:18:05,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:08,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 03:18:09,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 03:18:11,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:18:11,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 03:18:11,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=231880.0, ans=0.0 2023-09-29 03:18:12,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:18:14,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:18:14,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:18:17,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:18:17,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:24,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:18:26,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:27,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 03:18:27,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 03:18:27,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:18:30,653 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.83 vs. limit=22.5 2023-09-29 03:18:32,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:18:34,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 03:18:35,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:18:41,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=232013.33333333334, ans=0.125 2023-09-29 03:18:42,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:51,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:18:51,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:18:52,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 03:18:56,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:56,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 03:18:56,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:18:58,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:19:02,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:19:04,277 INFO [train.py:1039] (1/4) Epoch 7, batch 2950, loss[loss=0.2148, simple_loss=0.2853, pruned_loss=0.07214, over 24464.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.29, pruned_loss=0.08166, over 4676704.18 frames. ], batch size: 66, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:19:04,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 03:19:05,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:05,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:07,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:09,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:19:10,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 03:19:11,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 03:19:12,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:19:12,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:15,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=232146.66666666666, ans=0.05 2023-09-29 03:19:19,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:22,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:23,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:19:25,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:29,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:19:29,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:19:30,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:19:35,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 03:19:39,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 03:19:40,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.32 vs. limit=15.0 2023-09-29 03:19:40,648 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 03:19:40,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:19:42,326 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 03:19:43,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 03:19:43,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:45,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:45,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 03:19:45,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:19:50,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 03:19:50,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:51,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:19:54,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:57,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:19:58,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:19:58,547 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 03:19:58,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:58,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 03:20:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:08,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:09,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 03:20:09,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:20:11,822 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.080e+02 2.429e+02 2.872e+02 4.397e+02, threshold=4.858e+02, percent-clipped=0.0 2023-09-29 03:20:12,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 03:20:15,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:15,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=232413.33333333334, ans=0.125 2023-09-29 03:20:16,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:20:16,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:20:18,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:19,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:20:21,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:20:23,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:23,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:20:23,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:20:23,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:24,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:20:26,220 INFO [train.py:1039] (1/4) Epoch 7, batch 3000, loss[loss=0.2132, simple_loss=0.2814, pruned_loss=0.07248, over 24563.00 frames. ], tot_loss[loss=0.2274, simple_loss=0.2912, pruned_loss=0.08178, over 4693025.57 frames. ], batch size: 60, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:20:26,220 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 03:20:40,701 INFO [train.py:1071] (1/4) Epoch 7, validation: loss=0.3621, simple_loss=0.3045, pruned_loss=0.2099, over 1125622.00 frames. 2023-09-29 03:20:40,702 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 03:20:40,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:40,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 03:20:42,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:45,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:20:47,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:20:50,581 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 03:20:50,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 03:20:52,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:54,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:20:54,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 03:20:54,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:02,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:21:02,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=232546.66666666666, ans=0.125 2023-09-29 03:21:11,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:21:18,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 03:21:18,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:21:21,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:21:22,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=232613.33333333334, ans=0.125 2023-09-29 03:21:23,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:23,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:21:24,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:24,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 03:21:28,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 03:21:28,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=232680.0, ans=0.2 2023-09-29 03:21:30,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:21:30,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:21:31,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:21:31,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:33,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:33,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:21:36,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:21:38,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:38,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:21:40,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:42,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 03:21:43,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:21:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:21:43,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:21:46,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:46,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:48,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:21:48,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 03:21:48,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:21:50,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 03:21:50,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:21:51,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 03:21:55,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:21:57,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:21:57,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 03:21:57,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 03:21:57,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:21:58,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:22:00,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:22:01,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:22:01,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:01,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:22:03,349 INFO [train.py:1039] (1/4) Epoch 7, batch 3050, loss[loss=0.2532, simple_loss=0.3067, pruned_loss=0.09989, over 23386.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.293, pruned_loss=0.08315, over 4694672.82 frames. ], batch size: 119, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:22:03,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=232813.33333333334, ans=0.125 2023-09-29 03:22:05,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 03:22:07,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=232813.33333333334, ans=0.04949747468305833 2023-09-29 03:22:08,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:08,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=232813.33333333334, ans=0.0 2023-09-29 03:22:11,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:11,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:22:13,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=232813.33333333334, ans=0.025 2023-09-29 03:22:14,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 03:22:20,107 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:22:21,731 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.884e-03 2023-09-29 03:22:24,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 03:22:25,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 03:22:25,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:31,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:22:33,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=232880.0, ans=0.125 2023-09-29 03:22:35,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:35,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:36,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:39,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:22:39,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:22:39,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:41,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:41,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:42,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:44,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:46,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:46,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 03:22:48,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:48,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:22:52,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:53,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:22:54,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:22:54,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:00,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:23:01,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:09,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:09,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:09,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:23:11,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.044e+02 2.286e+02 2.664e+02 4.744e+02, threshold=4.572e+02, percent-clipped=0.0 2023-09-29 03:23:11,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:11,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:23:12,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:23:14,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 03:23:14,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:15,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:17,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 03:23:21,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:25,328 INFO [train.py:1039] (1/4) Epoch 7, batch 3100, loss[loss=0.2259, simple_loss=0.3059, pruned_loss=0.07294, over 24545.00 frames. ], tot_loss[loss=0.2276, simple_loss=0.2917, pruned_loss=0.08171, over 4712471.00 frames. ], batch size: 71, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:23:26,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:27,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:23:30,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:23:31,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 03:23:35,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 03:23:36,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 03:23:38,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:23:43,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:23:43,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:45,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:23:48,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:51,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=233213.33333333334, ans=0.125 2023-09-29 03:23:53,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 03:23:58,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:23:59,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:59,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:23:59,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:01,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:24:04,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:24:04,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 03:24:04,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:24:05,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:05,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=233280.0, ans=0.125 2023-09-29 03:24:07,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 03:24:09,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:24:13,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:24:14,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 03:24:15,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=233346.66666666666, ans=0.125 2023-09-29 03:24:16,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 03:24:17,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:18,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:21,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:21,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:21,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:24:23,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:24:23,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:24:23,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=233346.66666666666, ans=0.1 2023-09-29 03:24:24,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:24:26,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:24:26,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:26,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:24:30,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:24:30,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 03:24:32,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:24:33,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 03:24:33,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:33,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:33,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=233413.33333333334, ans=0.125 2023-09-29 03:24:35,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 03:24:40,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=233413.33333333334, ans=0.125 2023-09-29 03:24:47,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 03:24:48,344 INFO [train.py:1039] (1/4) Epoch 7, batch 3150, loss[loss=0.2269, simple_loss=0.2793, pruned_loss=0.0873, over 23837.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2899, pruned_loss=0.08064, over 4715625.38 frames. ], batch size: 212, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:24:50,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:50,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:50,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=233480.0, ans=0.1 2023-09-29 03:24:51,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:51,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:24:53,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 03:24:55,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:55,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:24:57,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 03:24:58,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:00,794 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 03:25:03,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 03:25:04,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:05,453 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 03:25:05,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=233546.66666666666, ans=0.0 2023-09-29 03:25:06,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:25:08,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 03:25:08,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 03:25:08,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 03:25:08,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:08,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:08,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:10,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 03:25:14,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:14,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:15,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:17,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:25:22,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 03:25:23,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:25:26,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:25:27,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:27,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 03:25:31,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 03:25:32,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:25:32,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:25:33,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:25:33,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:33,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:25:33,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=233613.33333333334, ans=0.0 2023-09-29 03:25:35,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:25:36,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:25:38,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 03:25:39,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:25:39,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:41,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:25:41,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:42,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 03:25:42,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:44,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 03:25:44,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:44,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 03:25:46,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 03:25:48,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:25:49,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:51,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 03:25:53,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:25:53,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:56,382 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 2.159e+02 2.421e+02 2.808e+02 3.931e+02, threshold=4.841e+02, percent-clipped=0.0 2023-09-29 03:25:56,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:58,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:58,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:26:00,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-09-29 03:26:04,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:26:04,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:07,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 03:26:11,312 INFO [train.py:1039] (1/4) Epoch 7, batch 3200, loss[loss=0.2235, simple_loss=0.2961, pruned_loss=0.07543, over 24359.00 frames. ], tot_loss[loss=0.2244, simple_loss=0.2884, pruned_loss=0.08016, over 4708734.11 frames. ], batch size: 77, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:26:11,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=233813.33333333334, ans=0.125 2023-09-29 03:26:14,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:26:14,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:26:14,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=233813.33333333334, ans=0.0 2023-09-29 03:26:18,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:20,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:26:20,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 03:26:23,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:26:26,408 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.90 vs. limit=6.0 2023-09-29 03:26:27,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:26:30,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:39,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:26:51,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 03:26:51,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:26:54,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 03:26:56,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:27:01,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:27:01,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:27:01,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:27:04,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 03:27:06,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:27:07,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=234013.33333333334, ans=15.0 2023-09-29 03:27:07,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 03:27:10,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 03:27:12,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:27:19,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:19,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:27:19,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:20,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 03:27:20,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:27:21,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=234080.0, ans=0.1 2023-09-29 03:27:25,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:27,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 03:27:28,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 03:27:30,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 03:27:31,012 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=15.0 2023-09-29 03:27:32,188 INFO [train.py:1039] (1/4) Epoch 7, batch 3250, loss[loss=0.236, simple_loss=0.2956, pruned_loss=0.08819, over 23778.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2888, pruned_loss=0.08032, over 4716358.08 frames. ], batch size: 212, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:27:32,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 03:27:33,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.08 vs. limit=15.0 2023-09-29 03:27:33,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:27:37,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:27:37,025 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 03:27:38,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:27:38,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:27:38,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=234146.66666666666, ans=0.1 2023-09-29 03:27:40,035 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 03:27:43,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:27:46,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:27:48,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=234213.33333333334, ans=0.0 2023-09-29 03:27:53,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:27:53,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 03:27:53,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:53,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=234213.33333333334, ans=0.2 2023-09-29 03:27:54,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:54,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:27:56,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:27:56,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:27:58,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=234213.33333333334, ans=0.1 2023-09-29 03:28:00,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:00,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:28:02,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:02,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:04,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:04,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=234280.0, ans=0.125 2023-09-29 03:28:08,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:28:09,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:09,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=234280.0, ans=0.2 2023-09-29 03:28:10,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:12,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:12,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:28:12,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:17,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 03:28:17,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:28:17,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:28:18,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:18,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:28:19,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=234346.66666666666, ans=0.125 2023-09-29 03:28:25,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:28:27,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=234346.66666666666, ans=0.2 2023-09-29 03:28:35,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:35,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:35,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 03:28:35,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:28:35,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:28:37,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:38,678 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.002e+02 2.212e+02 2.720e+02 4.684e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 03:28:38,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 03:28:38,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 03:28:38,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:41,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:42,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:44,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:28:44,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:46,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=234413.33333333334, ans=0.0 2023-09-29 03:28:48,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:48,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:28:50,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 03:28:50,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:28:52,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:28:52,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 03:28:53,361 INFO [train.py:1039] (1/4) Epoch 7, batch 3300, loss[loss=0.2287, simple_loss=0.3081, pruned_loss=0.07469, over 24462.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.2905, pruned_loss=0.08139, over 4708118.72 frames. ], batch size: 69, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:28:55,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:55,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 03:28:58,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 03:29:00,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 03:29:00,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:03,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:29:05,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:29:05,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.82 vs. limit=22.5 2023-09-29 03:29:06,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:06,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:29:08,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:29:10,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:12,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:29:16,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 03:29:18,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:19,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:20,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:20,511 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 03:29:23,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:29:23,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:29:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:29:23,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:29:23,622 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 03:29:28,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:28,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:29:30,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:30,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 03:29:31,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 03:29:31,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:33,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=234613.33333333334, ans=0.125 2023-09-29 03:29:34,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:29:35,021 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 03:29:38,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 03:29:38,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:29:42,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 03:29:45,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:29:46,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=234680.0, ans=0.1 2023-09-29 03:29:48,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:29:48,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:29:50,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=234680.0, ans=0.125 2023-09-29 03:29:53,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:53,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:53,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:54,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:29:56,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:29:58,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:58,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:29:59,974 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 03:30:00,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 03:30:03,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:30:03,430 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:30:04,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:04,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:06,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:30:06,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:08,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:30:08,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:08,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:30:09,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:30:11,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:30:15,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 03:30:15,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:16,677 INFO [train.py:1039] (1/4) Epoch 7, batch 3350, loss[loss=0.2645, simple_loss=0.3143, pruned_loss=0.1074, over 22820.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2909, pruned_loss=0.08151, over 4712843.73 frames. ], batch size: 322, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:30:16,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:18,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:30:18,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:30:20,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:23,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:23,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:26,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:30:28,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:28,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=234813.33333333334, ans=0.1 2023-09-29 03:30:29,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:30:31,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:33,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=234880.0, ans=0.0 2023-09-29 03:30:34,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:30:34,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:36,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:30:36,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 03:30:38,079 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 03:30:39,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:43,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 03:30:43,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 03:30:45,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:30:46,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:30:46,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:46,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=234880.0, ans=0.0 2023-09-29 03:30:47,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 03:30:47,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:47,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:30:49,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:51,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:52,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:30:56,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:30:59,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:59,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:02,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:31:04,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:07,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:07,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:10,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:11,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=235013.33333333334, ans=0.0 2023-09-29 03:31:12,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 03:31:12,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:31:12,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 03:31:14,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:31:15,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 03:31:17,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:17,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:22,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=235080.0, ans=0.125 2023-09-29 03:31:23,272 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.937e+02 2.206e+02 2.498e+02 3.654e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 03:31:27,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:27,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 03:31:28,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:28,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:31:30,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:31:34,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=235080.0, ans=0.125 2023-09-29 03:31:35,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:38,481 INFO [train.py:1039] (1/4) Epoch 7, batch 3400, loss[loss=0.2324, simple_loss=0.2909, pruned_loss=0.08695, over 23876.00 frames. ], tot_loss[loss=0.2293, simple_loss=0.2929, pruned_loss=0.08289, over 4704395.48 frames. ], batch size: 179, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:31:38,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 03:31:38,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:31:38,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:31:40,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:42,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 03:31:43,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:43,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 03:31:45,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:31:48,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:31:48,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 03:31:52,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 03:31:52,074 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 03:31:52,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:56,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:56,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:56,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:31:58,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:32:03,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:05,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 03:32:12,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:32:14,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:15,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:17,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:32:22,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:32:25,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 03:32:30,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 03:32:31,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:32:33,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:33,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:35,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:32:38,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:43,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:32:43,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:32:51,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:32:52,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 03:32:57,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:33:00,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 03:33:01,929 INFO [train.py:1039] (1/4) Epoch 7, batch 3450, loss[loss=0.1861, simple_loss=0.268, pruned_loss=0.05214, over 24465.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.2933, pruned_loss=0.0831, over 4691964.56 frames. ], batch size: 66, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:33:06,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 03:33:08,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:09,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:33:09,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 03:33:10,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:33:10,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=235480.0, ans=0.0 2023-09-29 03:33:13,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:33:18,090 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=14.48 vs. limit=15.0 2023-09-29 03:33:18,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:33:18,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:20,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:33:20,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:23,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:30,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 03:33:33,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=235613.33333333334, ans=0.0 2023-09-29 03:33:35,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 03:33:35,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:33:35,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:33:38,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:42,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 03:33:43,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:33:48,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:33:50,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:50,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:33:51,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:33:54,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 03:33:54,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:33:57,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:34:00,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:02,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=235680.0, ans=0.04949747468305833 2023-09-29 03:34:03,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 03:34:06,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:34:09,684 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.982e+02 2.278e+02 2.656e+02 4.314e+02, threshold=4.555e+02, percent-clipped=0.0 2023-09-29 03:34:11,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:34:13,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:16,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:16,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=235746.66666666666, ans=0.0 2023-09-29 03:34:21,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:21,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:34:21,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:34:23,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:34:24,909 INFO [train.py:1039] (1/4) Epoch 7, batch 3500, loss[loss=0.2092, simple_loss=0.2906, pruned_loss=0.06385, over 24565.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2909, pruned_loss=0.08156, over 4699580.39 frames. ], batch size: 71, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:34:26,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:27,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=235813.33333333334, ans=0.1 2023-09-29 03:34:30,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:34:30,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 03:34:32,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:34:35,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:34:37,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:37,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 03:34:39,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=235813.33333333334, ans=0.125 2023-09-29 03:34:42,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:34:44,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:44,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:34:44,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:34:44,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=235880.0, ans=0.2 2023-09-29 03:34:46,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:34:47,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:47,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:34:47,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 03:34:49,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:50,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:34:52,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:34:56,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:57,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 03:34:57,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:35:01,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:35:04,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:35:05,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:07,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:35:09,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:12,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 03:35:12,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 03:35:14,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 03:35:14,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:15,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:16,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:17,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:35:21,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:35:21,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:35:27,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:35:28,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 03:35:28,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 03:35:28,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:35:32,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:32,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:38,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 03:35:38,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:42,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:42,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 03:35:44,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 03:35:47,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:47,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:47,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:35:47,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:35:49,007 INFO [train.py:1039] (1/4) Epoch 7, batch 3550, loss[loss=0.2351, simple_loss=0.3061, pruned_loss=0.082, over 23742.00 frames. ], tot_loss[loss=0.2249, simple_loss=0.2889, pruned_loss=0.08043, over 4689966.54 frames. ], batch size: 85, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:35:50,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:36:00,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:00,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=236146.66666666666, ans=0.125 2023-09-29 03:36:03,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:36:06,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:08,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:36:08,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:36:10,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:36:14,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:14,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=236213.33333333334, ans=0.09899494936611666 2023-09-29 03:36:15,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:36:15,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:15,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:36:16,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:36:22,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:36:22,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:23,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:23,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:23,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:36:23,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 03:36:25,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:26,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:28,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:36:34,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:34,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:36,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:38,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 03:36:40,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:36:41,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 03:36:41,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:43,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:36:43,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:36:47,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 03:36:50,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:54,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:56,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 03:36:56,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:36:57,954 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.050e+02 2.243e+02 2.592e+02 3.943e+02, threshold=4.485e+02, percent-clipped=0.0 2023-09-29 03:36:58,314 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:36:59,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:37:01,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 03:37:07,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 03:37:07,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:07,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:37:10,908 INFO [train.py:1039] (1/4) Epoch 7, batch 3600, loss[loss=0.2312, simple_loss=0.2862, pruned_loss=0.08809, over 23448.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2886, pruned_loss=0.08046, over 4680039.51 frames. ], batch size: 285, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:37:10,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:11,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:12,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:37:16,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:16,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=236480.0, ans=0.125 2023-09-29 03:37:18,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:19,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:37:20,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:37:21,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:21,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 03:37:25,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:37:26,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:29,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:32,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:33,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:37:33,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=236546.66666666666, ans=0.125 2023-09-29 03:37:34,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:34,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 03:37:34,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:38,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:38,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:37:41,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:37:42,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:42,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:37:45,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 03:37:50,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=236613.33333333334, ans=0.125 2023-09-29 03:37:51,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:54,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:37:54,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 03:37:57,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:38:02,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:05,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:09,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=236680.0, ans=0.125 2023-09-29 03:38:11,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:38:11,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:38:11,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 03:38:13,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 03:38:13,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 03:38:15,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:38:17,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:38:18,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 03:38:20,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:20,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:38:20,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:22,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 03:38:23,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 03:38:27,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:28,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 03:38:32,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=236746.66666666666, ans=0.125 2023-09-29 03:38:34,831 INFO [train.py:1039] (1/4) Epoch 7, batch 3650, loss[loss=0.2014, simple_loss=0.2675, pruned_loss=0.06764, over 24490.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2891, pruned_loss=0.08026, over 4691890.36 frames. ], batch size: 63, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:38:34,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 03:38:36,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:38:39,066 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.44 vs. limit=15.0 2023-09-29 03:38:39,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 03:38:42,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 03:38:46,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:38:46,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:38:47,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:38:51,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:38:51,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:51,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 03:38:51,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:38:53,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:53,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 03:38:54,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:38:55,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:38:57,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:38:58,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:39:01,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 03:39:02,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 03:39:04,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:39:05,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 03:39:05,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:05,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:39:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:39:16,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:16,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:39:17,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:39:19,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:39:21,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:39:22,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.07 vs. limit=22.5 2023-09-29 03:39:24,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:26,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:26,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:28,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:39:28,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:30,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:35,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=237013.33333333334, ans=0.2 2023-09-29 03:39:36,575 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 03:39:38,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=237013.33333333334, ans=0.1 2023-09-29 03:39:41,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:39:41,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:42,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=237080.0, ans=0.0 2023-09-29 03:39:43,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:39:43,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:44,558 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.001e+02 2.357e+02 2.750e+02 4.366e+02, threshold=4.713e+02, percent-clipped=0.0 2023-09-29 03:39:44,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:39:46,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:47,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 03:39:47,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:51,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:39:52,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:52,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:39:56,982 INFO [train.py:1039] (1/4) Epoch 7, batch 3700, loss[loss=0.24, simple_loss=0.2935, pruned_loss=0.09329, over 23605.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2896, pruned_loss=0.08022, over 4703774.12 frames. ], batch size: 135, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:39:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:57,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 03:39:57,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:57,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:39:58,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:40:02,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:40:02,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=237146.66666666666, ans=0.0 2023-09-29 03:40:04,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:06,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:07,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:40:07,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:40:09,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:40:10,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:12,739 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 03:40:20,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:40:21,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:40:23,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:40:23,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 03:40:23,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:29,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:30,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 03:40:30,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:32,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:40:35,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:35,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:40:37,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:40:42,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:42,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 03:40:42,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:43,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 03:40:50,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:40:50,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:40:50,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=237346.66666666666, ans=0.0 2023-09-29 03:40:51,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.94 vs. limit=15.0 2023-09-29 03:40:53,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:54,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-09-29 03:40:55,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 03:40:58,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:40:58,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:40:58,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:40:58,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:01,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:41:02,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 03:41:02,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 03:41:04,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:41:04,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:07,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:41:07,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:41:11,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:41:14,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:41:17,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:41:18,613 INFO [train.py:1039] (1/4) Epoch 7, batch 3750, loss[loss=0.1949, simple_loss=0.2723, pruned_loss=0.05871, over 24476.00 frames. ], tot_loss[loss=0.2263, simple_loss=0.2911, pruned_loss=0.08076, over 4700647.47 frames. ], batch size: 66, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:41:18,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 03:41:20,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:41:23,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:41:23,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 03:41:25,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:41:27,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:27,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:30,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:41:33,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:37,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:41:39,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:41:40,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:43,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:41:45,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 03:41:47,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:41:48,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:49,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:54,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 03:41:57,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 03:41:59,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:59,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:42:01,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:02,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=237613.33333333334, ans=0.125 2023-09-29 03:42:05,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.83 vs. limit=15.0 2023-09-29 03:42:06,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:07,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:42:10,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 03:42:11,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=237680.0, ans=0.125 2023-09-29 03:42:14,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:17,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:42:18,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:42:20,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:42:27,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.86 vs. limit=15.0 2023-09-29 03:42:27,704 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.086e+02 2.277e+02 2.555e+02 3.671e+02, threshold=4.554e+02, percent-clipped=0.0 2023-09-29 03:42:27,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:42:29,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:42:32,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:42:32,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:42:35,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:42:40,454 INFO [train.py:1039] (1/4) Epoch 7, batch 3800, loss[loss=0.2256, simple_loss=0.2772, pruned_loss=0.08706, over 23432.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2904, pruned_loss=0.07979, over 4724204.42 frames. ], batch size: 285, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:42:45,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:42:49,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:49,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:42:50,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.97 vs. limit=22.5 2023-09-29 03:42:51,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 03:42:53,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:55,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:42:55,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:42:56,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 03:42:56,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:57,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=237880.0, ans=0.0 2023-09-29 03:42:58,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:43:00,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=237880.0, ans=0.1 2023-09-29 03:43:01,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:43:01,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:43:01,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:03,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 03:43:06,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 03:43:06,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=237880.0, ans=0.2 2023-09-29 03:43:08,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:43:10,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:13,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:43:14,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:43:16,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:43:17,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:19,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:20,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:22,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.40 vs. limit=22.5 2023-09-29 03:43:24,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=237946.66666666666, ans=0.2 2023-09-29 03:43:26,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:43:26,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 03:43:27,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:32,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:43:40,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=238013.33333333334, ans=0.125 2023-09-29 03:43:41,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:43:45,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 03:43:46,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 03:43:48,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:49,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:51,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:52,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 03:43:55,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.95 vs. limit=15.0 2023-09-29 03:43:56,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 03:43:56,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 03:43:56,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:57,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:44:03,080 INFO [train.py:1039] (1/4) Epoch 7, batch 3850, loss[loss=0.2372, simple_loss=0.3029, pruned_loss=0.08574, over 24476.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.289, pruned_loss=0.08016, over 4710955.46 frames. ], batch size: 66, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:44:03,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:44:04,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:44:11,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:44:11,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 03:44:12,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:44:15,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:18,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:44:20,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:23,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:44:24,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 03:44:29,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:31,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:34,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:34,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:44:39,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:39,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:44:40,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:40,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:44:41,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:44,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:44:46,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 03:44:46,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 03:44:48,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:48,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:52,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 03:44:56,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 03:44:56,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=238346.66666666666, ans=0.0 2023-09-29 03:44:58,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:59,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 03:45:01,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:45:06,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:08,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:45:13,054 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.089e+02 2.371e+02 2.859e+02 5.421e+02, threshold=4.742e+02, percent-clipped=3.0 2023-09-29 03:45:13,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:13,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 03:45:15,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 03:45:18,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:19,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:19,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:45:19,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:45:21,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:45:23,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 03:45:23,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=238413.33333333334, ans=0.09899494936611666 2023-09-29 03:45:24,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:45:26,770 INFO [train.py:1039] (1/4) Epoch 7, batch 3900, loss[loss=0.2123, simple_loss=0.2786, pruned_loss=0.07298, over 24360.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2878, pruned_loss=0.07969, over 4712354.09 frames. ], batch size: 61, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:45:28,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 03:45:28,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:28,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:29,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:45:29,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:31,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:45:32,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:32,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:33,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:33,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 03:45:34,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:39,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:41,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:45:41,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:43,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:44,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:46,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:45:47,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 03:45:47,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:45:50,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 03:45:50,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:50,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 03:45:52,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 03:45:57,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:45:59,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:59,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:46:00,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=238613.33333333334, ans=0.2 2023-09-29 03:46:01,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:05,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:46:07,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:46:12,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:46:12,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:13,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:46:19,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:20,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:46:25,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:46:26,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:46:32,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=238746.66666666666, ans=0.2 2023-09-29 03:46:37,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:46:39,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:41,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 03:46:41,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 03:46:41,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:42,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 03:46:44,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:44,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 03:46:48,695 INFO [train.py:1039] (1/4) Epoch 7, batch 3950, loss[loss=0.2274, simple_loss=0.3028, pruned_loss=0.07604, over 24566.00 frames. ], tot_loss[loss=0.2237, simple_loss=0.2881, pruned_loss=0.07965, over 4711091.45 frames. ], batch size: 71, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:46:52,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:54,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 03:46:55,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:46:57,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:46:59,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=238813.33333333334, ans=0.0 2023-09-29 03:47:00,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:47:03,997 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 03:47:06,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:06,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 03:47:07,453 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 03:47:07,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:11,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:11,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:47:11,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:14,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 03:47:17,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:47:17,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:17,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:47:18,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:47:18,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:47:30,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:47:31,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:47:36,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 03:47:37,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.23 vs. limit=15.0 2023-09-29 03:47:42,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.15 vs. limit=10.0 2023-09-29 03:47:45,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 03:47:45,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 03:47:45,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:47:45,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:47:51,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:47:52,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:47:52,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:53,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:47:53,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 03:47:56,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=239080.0, ans=0.125 2023-09-29 03:47:57,667 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.025e+02 2.218e+02 2.611e+02 3.934e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 03:47:57,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:47:58,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:47:58,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=239080.0, ans=0.125 2023-09-29 03:48:00,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=239080.0, ans=0.125 2023-09-29 03:48:01,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 03:48:07,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=239080.0, ans=0.125 2023-09-29 03:48:11,823 INFO [train.py:1039] (1/4) Epoch 7, batch 4000, loss[loss=0.236, simple_loss=0.2873, pruned_loss=0.09235, over 23676.00 frames. ], tot_loss[loss=0.2237, simple_loss=0.2883, pruned_loss=0.0795, over 4719839.19 frames. ], batch size: 164, lr: 1.45e-02, grad_scale: 32.0 2023-09-29 03:48:12,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:18,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:24,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:24,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:48:26,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:26,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 03:48:26,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=239213.33333333334, ans=0.1 2023-09-29 03:48:27,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:48:27,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 03:48:27,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:48:27,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 03:48:30,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:33,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:48:34,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:48:34,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:48:36,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:48:36,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:48:36,578 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:48:38,737 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.38 vs. limit=10.0 2023-09-29 03:48:39,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:48:41,238 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 03:48:41,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:48:43,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:48:46,656 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 03:48:47,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.92 vs. limit=10.0 2023-09-29 03:48:48,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:48:49,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:48:54,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=239280.0, ans=0.125 2023-09-29 03:48:57,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 03:48:57,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:49:00,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:49:00,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=239346.66666666666, ans=0.1 2023-09-29 03:49:02,029 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 03:49:03,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:49:03,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 03:49:03,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:05,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:06,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:49:06,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=239346.66666666666, ans=0.035 2023-09-29 03:49:08,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:49:09,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:49:09,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:49:11,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 03:49:11,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:13,320 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 03:49:18,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:49:20,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:49:22,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:49:22,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:23,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:49:25,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:28,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=239413.33333333334, ans=0.125 2023-09-29 03:49:29,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:30,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:49:30,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 03:49:33,604 INFO [train.py:1039] (1/4) Epoch 7, batch 4050, loss[loss=0.2362, simple_loss=0.2906, pruned_loss=0.09087, over 23853.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.289, pruned_loss=0.07957, over 4709419.04 frames. ], batch size: 195, lr: 1.44e-02, grad_scale: 32.0 2023-09-29 03:49:33,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:49:33,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:49:33,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:49:36,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:49:37,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=239480.0, ans=0.2 2023-09-29 03:49:38,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:43,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:43,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=239480.0, ans=0.125 2023-09-29 03:49:46,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:49:46,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:49:49,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=239546.66666666666, ans=15.0 2023-09-29 03:49:50,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:49:51,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:56,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:58,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:50:00,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 03:50:03,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 03:50:03,810 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 03:50:05,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:50:12,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 03:50:13,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:14,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:18,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:50:19,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:50:19,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:23,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:50:25,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=239680.0, ans=0.09899494936611666 2023-09-29 03:50:26,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 03:50:26,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:50:28,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:28,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=239680.0, ans=0.0 2023-09-29 03:50:30,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 03:50:34,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:40,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=239746.66666666666, ans=0.125 2023-09-29 03:50:41,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 03:50:43,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:43,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:50:44,862 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.977e+02 2.189e+02 2.469e+02 3.390e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 03:50:47,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 03:50:47,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 03:50:47,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:50:51,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:50:52,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:50:52,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:50:56,007 INFO [train.py:1039] (1/4) Epoch 7, batch 4100, loss[loss=0.2087, simple_loss=0.2881, pruned_loss=0.06463, over 23976.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2904, pruned_loss=0.08065, over 4711037.08 frames. ], batch size: 80, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:50:56,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.90 vs. limit=15.0 2023-09-29 03:51:01,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 03:51:02,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 03:51:04,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 03:51:07,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 03:51:07,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:08,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-29 03:51:09,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:51:10,560 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 03:51:14,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:14,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=239880.0, ans=10.0 2023-09-29 03:51:15,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:51:15,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:17,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:51:19,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:51:20,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:21,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:51:21,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 03:51:23,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:23,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:51:23,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:23,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:51:23,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 03:51:26,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:28,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 03:51:30,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:51:31,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:31,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 03:51:32,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.60 vs. limit=15.0 2023-09-29 03:51:33,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:51:33,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:51:34,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:51:37,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 03:51:38,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:51:40,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:51:48,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 03:51:48,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:50,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:51:53,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:53,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=240013.33333333334, ans=0.125 2023-09-29 03:51:56,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:01,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:02,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:52:03,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.64 vs. limit=22.5 2023-09-29 03:52:08,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=240080.0, ans=0.125 2023-09-29 03:52:09,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:09,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:52:13,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:16,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:52:21,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:52:21,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:52:22,666 INFO [train.py:1039] (1/4) Epoch 7, batch 4150, loss[loss=0.2397, simple_loss=0.3045, pruned_loss=0.08749, over 23998.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.291, pruned_loss=0.08029, over 4721989.88 frames. ], batch size: 86, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:52:22,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:52:22,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:24,051 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-09-29 03:52:26,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 03:52:26,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:27,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 03:52:29,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 03:52:29,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 03:52:31,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=240146.66666666666, ans=0.125 2023-09-29 03:52:32,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:38,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:52:38,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:42,019 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.81 vs. limit=12.0 2023-09-29 03:52:42,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:52:45,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:52:45,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:52:46,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:52:48,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:52:54,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:56,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:52:58,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 03:53:01,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 03:53:01,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:53:01,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=240280.0, ans=0.0 2023-09-29 03:53:03,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 03:53:03,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:53:03,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:06,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:06,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:10,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 03:53:13,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:16,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:53:18,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 03:53:18,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:53:19,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 03:53:20,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=240346.66666666666, ans=0.1 2023-09-29 03:53:21,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:53:22,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:24,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 03:53:24,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:24,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:53:26,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:53:29,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 03:53:29,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:29,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:53:29,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:53:31,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 03:53:32,553 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.142e+02 2.391e+02 2.660e+02 4.088e+02, threshold=4.782e+02, percent-clipped=0.0 2023-09-29 03:53:32,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:32,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:53:32,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:53:35,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:35,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 03:53:36,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:42,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:53:44,027 INFO [train.py:1039] (1/4) Epoch 7, batch 4200, loss[loss=0.2212, simple_loss=0.2611, pruned_loss=0.0907, over 22688.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2898, pruned_loss=0.07967, over 4733609.96 frames. ], batch size: 323, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:53:44,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 03:53:45,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:53:48,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:53:51,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:53:51,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:51,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:54,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 03:53:57,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 03:53:57,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:59,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:01,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:54:05,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:54:06,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=240546.66666666666, ans=0.04949747468305833 2023-09-29 03:54:08,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=240546.66666666666, ans=0.0 2023-09-29 03:54:09,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:09,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:10,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 03:54:10,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:11,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:12,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:54:12,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:54:15,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:54:17,672 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.88 vs. limit=15.0 2023-09-29 03:54:18,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 03:54:18,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:21,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:54:23,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:54:27,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:54:30,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:54:32,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:54:32,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 03:54:33,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:33,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:54:38,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:54:39,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:42,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=240680.0, ans=0.0 2023-09-29 03:54:45,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:54:49,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 03:54:51,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=240746.66666666666, ans=0.2 2023-09-29 03:54:52,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:56,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:54:56,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:54:58,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 03:55:05,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:55:06,371 INFO [train.py:1039] (1/4) Epoch 7, batch 4250, loss[loss=0.2095, simple_loss=0.2707, pruned_loss=0.0742, over 23872.00 frames. ], tot_loss[loss=0.2232, simple_loss=0.2881, pruned_loss=0.07912, over 4705450.89 frames. ], batch size: 195, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:55:09,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:55:09,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:55:13,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:17,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:55:19,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 03:55:19,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:55:21,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:24,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:29,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:31,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:33,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:55:33,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:55:34,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:36,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:37,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:40,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:55:42,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:55:44,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 03:55:47,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 03:55:47,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:48,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=240946.66666666666, ans=0.95 2023-09-29 03:55:49,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:49,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:51,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:55:51,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:56,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:55:57,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:56:00,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:02,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:03,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 03:56:03,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:56:06,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 03:56:07,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:56:09,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:56:12,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:12,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:56:13,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 03:56:15,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:56:16,640 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.803e+02 2.211e+02 2.441e+02 2.743e+02 4.963e+02, threshold=4.882e+02, percent-clipped=1.0 2023-09-29 03:56:16,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:56:16,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=241080.0, ans=0.015 2023-09-29 03:56:17,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=241080.0, ans=0.0 2023-09-29 03:56:19,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:23,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:25,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:56:26,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:28,511 INFO [train.py:1039] (1/4) Epoch 7, batch 4300, loss[loss=0.2039, simple_loss=0.2776, pruned_loss=0.06513, over 24475.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2885, pruned_loss=0.07934, over 4704212.60 frames. ], batch size: 66, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:56:28,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:30,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:56:30,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:56:30,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 03:56:31,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:37,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:37,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:56:40,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=241146.66666666666, ans=0.125 2023-09-29 03:56:43,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:50,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:50,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 03:56:51,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:56:54,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:56:54,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:56:54,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 03:56:58,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.39 vs. limit=15.0 2023-09-29 03:56:59,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=241280.0, ans=0.125 2023-09-29 03:57:01,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:57:01,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:04,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 03:57:04,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:57:05,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 03:57:07,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:57:08,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=241280.0, ans=0.0 2023-09-29 03:57:09,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:57:11,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:57:11,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:57:12,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:57:14,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:16,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:57:16,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 03:57:16,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 03:57:18,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=241346.66666666666, ans=0.1 2023-09-29 03:57:19,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:57:20,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=241346.66666666666, ans=0.1 2023-09-29 03:57:23,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:23,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:57:23,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:24,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 03:57:24,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 03:57:25,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 03:57:26,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:57:27,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 03:57:27,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 03:57:29,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:32,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 03:57:32,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:57:36,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:36,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:39,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 03:57:41,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:41,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:41,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:57:41,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:42,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:57:42,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:57:44,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:46,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:46,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:51,827 INFO [train.py:1039] (1/4) Epoch 7, batch 4350, loss[loss=0.2717, simple_loss=0.3144, pruned_loss=0.1145, over 19300.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2889, pruned_loss=0.07913, over 4717453.32 frames. ], batch size: 388, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:57:53,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 03:57:53,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:57:58,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:01,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:04,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:58:04,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:58:10,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:58:13,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:15,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:58:16,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:58:20,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:58:23,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:58:23,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=241613.33333333334, ans=0.025 2023-09-29 03:58:24,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:58:30,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 03:58:31,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:31,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=241613.33333333334, ans=0.0 2023-09-29 03:58:32,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:33,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=241613.33333333334, ans=0.1 2023-09-29 03:58:37,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:39,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 03:58:44,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:58:46,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:58:50,875 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 03:58:52,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:52,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:58:53,868 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 03:58:54,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=241680.0, ans=0.1 2023-09-29 03:58:56,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 03:58:56,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:56,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:57,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:58:57,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:57,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=241746.66666666666, ans=0.1 2023-09-29 03:58:59,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:59,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:02,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 03:59:02,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:02,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:02,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:04,150 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.126e+02 2.357e+02 2.632e+02 4.633e+02, threshold=4.715e+02, percent-clipped=0.0 2023-09-29 03:59:04,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 03:59:05,860 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 03:59:05,868 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 03:59:05,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 03:59:08,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:59:09,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:59:09,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:10,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:59:12,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 03:59:13,469 INFO [train.py:1039] (1/4) Epoch 7, batch 4400, loss[loss=0.2438, simple_loss=0.3174, pruned_loss=0.08506, over 24074.00 frames. ], tot_loss[loss=0.2251, simple_loss=0.2903, pruned_loss=0.07997, over 4715640.83 frames. ], batch size: 80, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:59:15,069 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 03:59:15,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:19,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:20,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:21,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=241813.33333333334, ans=0.125 2023-09-29 03:59:22,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:24,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 03:59:24,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 03:59:25,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 03:59:25,838 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 03:59:25,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:59:26,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:29,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 03:59:31,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:32,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:32,618 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 03:59:37,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:37,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 03:59:39,147 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 03:59:42,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 03:59:43,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 03:59:43,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 03:59:43,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:45,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:45,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:46,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:48,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 03:59:48,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 03:59:50,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:51,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:59:51,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:51,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=241946.66666666666, ans=0.1 2023-09-29 03:59:53,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:53,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:53,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 03:59:55,811 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 03:59:58,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:05,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:00:08,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 04:00:15,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:00:16,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:18,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:00:19,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 04:00:19,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:00:19,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:00:19,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:00:21,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:00:24,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=242080.0, ans=0.05 2023-09-29 04:00:25,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 04:00:29,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 04:00:31,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 04:00:31,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:00:31,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 04:00:34,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:00:36,392 INFO [train.py:1039] (1/4) Epoch 7, batch 4450, loss[loss=0.1922, simple_loss=0.2731, pruned_loss=0.05568, over 24491.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2907, pruned_loss=0.07999, over 4719989.55 frames. ], batch size: 66, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:00:36,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:00:37,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=242146.66666666666, ans=0.0 2023-09-29 04:00:38,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 04:00:43,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:44,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:44,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:00:51,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=242213.33333333334, ans=0.04949747468305833 2023-09-29 04:00:52,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:00:52,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:00:53,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=242213.33333333334, ans=0.2 2023-09-29 04:00:56,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:56,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:01:01,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:01:01,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:03,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 04:01:03,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:03,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:03,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:03,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:01:07,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:01:12,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:12,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:14,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:14,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:16,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:01:21,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:01:22,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 04:01:22,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 04:01:22,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:01:25,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:27,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 04:01:30,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:01:33,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:35,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 04:01:35,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:35,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:35,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:01:35,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:39,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:39,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=242346.66666666666, ans=0.1 2023-09-29 04:01:41,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:01:42,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 04:01:44,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:01:45,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:47,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:49,220 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.081e+02 2.382e+02 2.836e+02 4.315e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 04:01:49,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:50,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:01:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:01:52,568 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:01:55,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 04:01:57,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:01:59,207 INFO [train.py:1039] (1/4) Epoch 7, batch 4500, loss[loss=0.2136, simple_loss=0.2887, pruned_loss=0.06925, over 24655.00 frames. ], tot_loss[loss=0.2261, simple_loss=0.291, pruned_loss=0.08059, over 4705211.23 frames. ], batch size: 68, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:02:04,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:05,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 04:02:05,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 04:02:08,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:02:12,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:14,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:02:15,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:02:15,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:15,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:26,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=242546.66666666666, ans=0.125 2023-09-29 04:02:27,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:27,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=242546.66666666666, ans=0.0 2023-09-29 04:02:29,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:02:31,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:32,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:02:34,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:02:35,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=242613.33333333334, ans=0.125 2023-09-29 04:02:40,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:02:40,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=242613.33333333334, ans=0.1 2023-09-29 04:02:44,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:02:49,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:02:50,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:02:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 04:02:52,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:02:54,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:54,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=242680.0, ans=0.125 2023-09-29 04:02:56,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:57,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:59,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:59,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 04:02:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:02:59,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:04,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:03:04,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:03:07,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:09,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.56 vs. limit=15.0 2023-09-29 04:03:10,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:03:10,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:03:13,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 04:03:13,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 04:03:13,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 04:03:19,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 04:03:22,554 INFO [train.py:1039] (1/4) Epoch 7, batch 4550, loss[loss=0.2251, simple_loss=0.2576, pruned_loss=0.09625, over 19341.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2899, pruned_loss=0.07998, over 4707392.06 frames. ], batch size: 388, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:03:22,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 04:03:22,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:28,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:29,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:31,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:36,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:03:36,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=242813.33333333334, ans=0.125 2023-09-29 04:03:37,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:03:39,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:03:40,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:03:40,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:42,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:44,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:45,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:03:49,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 04:03:49,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 04:03:51,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:03:52,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 04:03:57,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 04:03:57,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:01,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=242946.66666666666, ans=0.125 2023-09-29 04:04:02,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 04:04:04,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:04:07,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:07,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:07,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:04:07,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=242946.66666666666, ans=0.2 2023-09-29 04:04:09,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 04:04:12,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:15,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:15,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:17,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:17,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 04:04:17,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 04:04:18,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:04:19,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 04:04:19,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.20 vs. limit=22.5 2023-09-29 04:04:20,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 04:04:20,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:22,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=243013.33333333334, ans=0.0 2023-09-29 04:04:23,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:23,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:25,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:25,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:04:28,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:04:29,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 04:04:31,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:31,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:04:31,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 04:04:31,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:04:31,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 04:04:33,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=243080.0, ans=0.2 2023-09-29 04:04:34,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:04:34,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:04:36,259 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.221e+02 2.599e+02 3.023e+02 5.403e+02, threshold=5.198e+02, percent-clipped=1.0 2023-09-29 04:04:37,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:04:40,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:40,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:04:41,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:04:43,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:04:46,064 INFO [train.py:1039] (1/4) Epoch 7, batch 4600, loss[loss=0.2272, simple_loss=0.2852, pruned_loss=0.08461, over 23597.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.2889, pruned_loss=0.07945, over 4704468.75 frames. ], batch size: 256, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:04:46,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:47,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:50,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:04:50,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:04:52,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:04:53,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 04:04:55,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:04:59,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:05:00,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=243213.33333333334, ans=0.125 2023-09-29 04:05:01,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:01,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=243213.33333333334, ans=0.04949747468305833 2023-09-29 04:05:03,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:10,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 04:05:12,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:15,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:17,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:05:17,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:23,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 04:05:23,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:05:23,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=243280.0, ans=0.1 2023-09-29 04:05:25,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:05:32,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:32,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:05:35,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:05:37,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=243346.66666666666, ans=0.0 2023-09-29 04:05:39,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 04:05:41,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:05:45,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:46,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:05:48,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:48,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 04:05:50,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:50,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 04:05:50,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:50,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:52,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:53,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:55,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:56,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 04:05:56,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 04:05:56,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 04:05:56,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:05:58,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:05:59,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:01,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:06:07,198 INFO [train.py:1039] (1/4) Epoch 7, batch 4650, loss[loss=0.2258, simple_loss=0.3041, pruned_loss=0.07374, over 24397.00 frames. ], tot_loss[loss=0.223, simple_loss=0.2886, pruned_loss=0.07866, over 4709864.97 frames. ], batch size: 77, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:06:12,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:06:16,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:16,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:18,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:06:18,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:18,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:06:19,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:22,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 04:06:26,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:06:29,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 04:06:29,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:29,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 04:06:31,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:06:31,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 04:06:32,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 04:06:32,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:32,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:06:34,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:06:37,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:37,465 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 04:06:40,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=243613.33333333334, ans=0.07 2023-09-29 04:06:41,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:43,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 04:06:45,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=243613.33333333334, ans=0.125 2023-09-29 04:06:46,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:46,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:06:46,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 04:06:48,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:06:50,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=243613.33333333334, ans=0.0 2023-09-29 04:06:51,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:06:56,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:06:56,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=243680.0, ans=0.125 2023-09-29 04:06:58,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=243680.0, ans=0.1 2023-09-29 04:07:00,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:04,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:04,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:06,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:07:06,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 04:07:07,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 04:07:07,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 04:07:07,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 04:07:09,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=243680.0, ans=0.2 2023-09-29 04:07:10,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:12,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=243746.66666666666, ans=0.0 2023-09-29 04:07:18,480 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.995e+02 2.331e+02 2.666e+02 3.727e+02, threshold=4.663e+02, percent-clipped=0.0 2023-09-29 04:07:18,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:07:18,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:18,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 04:07:18,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:19,609 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.86 vs. limit=22.5 2023-09-29 04:07:20,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:20,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:07:22,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:07:24,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:07:24,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:26,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:27,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:28,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=243813.33333333334, ans=0.125 2023-09-29 04:07:29,921 INFO [train.py:1039] (1/4) Epoch 7, batch 4700, loss[loss=0.2537, simple_loss=0.3113, pruned_loss=0.09801, over 23880.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2894, pruned_loss=0.07891, over 4719842.25 frames. ], batch size: 195, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:07:30,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:07:30,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:07:31,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:07:32,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:07:33,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 04:07:40,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:42,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:44,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:07:45,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:47,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:07:50,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 04:07:51,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 04:07:53,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:54,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=243880.0, ans=0.2 2023-09-29 04:07:55,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:07:57,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:08:01,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:06,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:08:09,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:08:11,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:08:13,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.12 vs. limit=15.0 2023-09-29 04:08:17,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 04:08:19,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:08:22,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:22,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=244013.33333333334, ans=0.0 2023-09-29 04:08:25,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 04:08:26,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:08:31,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=244013.33333333334, ans=0.1 2023-09-29 04:08:32,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:08:32,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 04:08:33,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:33,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:36,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.85 vs. limit=15.0 2023-09-29 04:08:36,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:38,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:08:38,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 04:08:38,544 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 04:08:38,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=244080.0, ans=0.09899494936611666 2023-09-29 04:08:41,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:43,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 04:08:45,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=244080.0, ans=0.2 2023-09-29 04:08:46,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:47,128 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.11 vs. limit=15.0 2023-09-29 04:08:48,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=244080.0, ans=0.125 2023-09-29 04:08:49,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 04:08:50,856 INFO [train.py:1039] (1/4) Epoch 7, batch 4750, loss[loss=0.2131, simple_loss=0.294, pruned_loss=0.06612, over 24615.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2906, pruned_loss=0.07967, over 4712207.78 frames. ], batch size: 73, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:08:52,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:08:52,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:58,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:58,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:09:01,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 04:09:01,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:02,541 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.33 vs. limit=15.0 2023-09-29 04:09:05,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=244213.33333333334, ans=0.0 2023-09-29 04:09:06,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 04:09:09,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:09:09,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:09:09,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:14,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 04:09:19,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:09:20,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 04:09:21,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:24,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=244280.0, ans=0.2 2023-09-29 04:09:25,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:26,976 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 04:09:26,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 04:09:34,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 04:09:37,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:39,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=244346.66666666666, ans=0.125 2023-09-29 04:09:41,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:09:42,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:09:42,834 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 04:09:42,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:09:46,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:09:48,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:09:51,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 04:09:51,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 04:09:51,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:53,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:09:53,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:54,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:09:54,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 04:09:54,979 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:09:55,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=244413.33333333334, ans=10.0 2023-09-29 04:09:56,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=244413.33333333334, ans=0.2 2023-09-29 04:09:57,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 04:10:00,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.17 vs. limit=22.5 2023-09-29 04:10:00,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:02,301 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.924e+02 2.114e+02 2.511e+02 3.995e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 04:10:02,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:10:02,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 04:10:02,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:04,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:06,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:10:07,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:08,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:10:08,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=244413.33333333334, ans=10.0 2023-09-29 04:10:10,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:11,388 INFO [train.py:1039] (1/4) Epoch 7, batch 4800, loss[loss=0.1882, simple_loss=0.2588, pruned_loss=0.05884, over 24447.00 frames. ], tot_loss[loss=0.226, simple_loss=0.2912, pruned_loss=0.08044, over 4710674.17 frames. ], batch size: 58, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:10:11,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 04:10:11,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 04:10:13,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 04:10:17,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:10:19,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:20,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 04:10:26,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:27,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:32,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:10:32,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:32,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:32,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=244546.66666666666, ans=0.04949747468305833 2023-09-29 04:10:33,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 04:10:35,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:35,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:10:35,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:10:41,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:10:42,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:43,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:10:44,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:44,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:10:44,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:44,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:48,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:51,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:10:55,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:10:57,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:58,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 04:10:58,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 04:10:58,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=244613.33333333334, ans=0.0 2023-09-29 04:11:00,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:00,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:11:00,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:11:00,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:00,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:11:02,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:11:02,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:05,233 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:11:07,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:09,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:11,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:11,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=244680.0, ans=0.2 2023-09-29 04:11:15,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 04:11:15,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=244746.66666666666, ans=0.125 2023-09-29 04:11:17,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:17,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:17,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:11:17,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:17,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=244746.66666666666, ans=0.0 2023-09-29 04:11:19,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=244746.66666666666, ans=0.125 2023-09-29 04:11:22,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:23,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:11:23,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:23,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=244746.66666666666, ans=0.05 2023-09-29 04:11:24,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:11:24,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:11:26,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:11:30,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:30,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:30,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:31,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 04:11:34,709 INFO [train.py:1039] (1/4) Epoch 7, batch 4850, loss[loss=0.2028, simple_loss=0.2724, pruned_loss=0.06655, over 24301.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.291, pruned_loss=0.08032, over 4712474.70 frames. ], batch size: 56, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:11:36,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 04:11:36,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:36,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:37,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:11:37,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:40,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:48,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 04:11:49,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:54,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:11:55,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:11:56,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:00,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:12:00,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:12:03,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:12:03,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 04:12:04,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=244880.0, ans=0.125 2023-09-29 04:12:04,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.59 vs. limit=6.0 2023-09-29 04:12:07,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:12:10,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:12:10,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:12:10,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:12:10,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 04:12:14,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:12:14,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:17,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:18,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 04:12:19,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 04:12:20,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:12:25,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:12:27,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 04:12:29,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:12:29,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:12:31,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:12:33,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 04:12:33,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:35,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 04:12:35,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:38,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:12:38,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 04:12:46,412 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.410e+02 2.869e+02 4.952e+02, threshold=4.821e+02, percent-clipped=3.0 2023-09-29 04:12:46,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:52,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:12:52,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:12:54,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=245146.66666666666, ans=0.125 2023-09-29 04:12:55,908 INFO [train.py:1039] (1/4) Epoch 7, batch 4900, loss[loss=0.1953, simple_loss=0.262, pruned_loss=0.06426, over 24355.00 frames. ], tot_loss[loss=0.2244, simple_loss=0.2892, pruned_loss=0.0798, over 4723616.08 frames. ], batch size: 56, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:12:57,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 04:12:57,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:13:03,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:05,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:13:05,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:13:09,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 04:13:15,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 04:13:17,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=245213.33333333334, ans=0.1 2023-09-29 04:13:19,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 04:13:20,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 04:13:20,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:22,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:13:22,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:13:22,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:22,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:13:22,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 04:13:25,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 04:13:25,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:13:28,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:13:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:31,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:13:31,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:33,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:33,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 04:13:34,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:13:35,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:35,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 04:13:35,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 04:13:38,072 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.76 vs. limit=6.0 2023-09-29 04:13:40,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 04:13:42,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:13:44,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:13:44,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:13:46,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:46,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:13:46,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:13:47,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 04:13:48,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=245346.66666666666, ans=0.125 2023-09-29 04:13:50,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:52,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:13:53,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:13:57,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 04:13:58,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:13:58,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:13:58,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 04:14:04,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:05,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:06,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 04:14:06,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:06,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:14:08,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=245413.33333333334, ans=0.2 2023-09-29 04:14:10,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:14,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:15,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:14:15,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:15,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 04:14:16,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:14:18,199 INFO [train.py:1039] (1/4) Epoch 7, batch 4950, loss[loss=0.2126, simple_loss=0.2893, pruned_loss=0.06797, over 24092.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2875, pruned_loss=0.07984, over 4707395.58 frames. ], batch size: 86, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:14:18,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:20,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:23,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 04:14:24,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 04:14:24,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:14:25,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 04:14:25,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:26,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:14:26,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:14:26,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:28,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:30,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:14:31,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:14:32,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:34,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:34,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:38,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:14:38,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=245546.66666666666, ans=0.125 2023-09-29 04:14:47,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:48,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:50,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:50,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:52,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:14:53,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 04:14:53,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 04:14:57,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:58,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:15:00,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:15:01,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:01,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:15:03,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:15:04,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:06,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=245680.0, ans=0.125 2023-09-29 04:15:07,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:15:08,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:15:09,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:09,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:10,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=245680.0, ans=0.0 2023-09-29 04:15:12,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 04:15:12,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:15:15,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:15:18,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:15:21,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:15:21,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:15:21,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:23,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:15:23,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:15:25,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:15:26,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:15:26,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:28,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 04:15:28,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=245746.66666666666, ans=0.07 2023-09-29 04:15:31,886 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.077e+02 2.324e+02 2.627e+02 6.143e+02, threshold=4.647e+02, percent-clipped=3.0 2023-09-29 04:15:32,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:32,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=245746.66666666666, ans=0.125 2023-09-29 04:15:36,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=245746.66666666666, ans=0.025 2023-09-29 04:15:39,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 04:15:39,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:15:40,902 INFO [train.py:1039] (1/4) Epoch 7, batch 5000, loss[loss=0.2044, simple_loss=0.2719, pruned_loss=0.06849, over 24359.00 frames. ], tot_loss[loss=0.2227, simple_loss=0.2866, pruned_loss=0.07939, over 4711652.43 frames. ], batch size: 56, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:15:47,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:47,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:15:49,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 04:15:49,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 04:15:53,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:15:55,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=245813.33333333334, ans=0.2 2023-09-29 04:15:56,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 04:15:56,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:56,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:15:56,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 04:15:56,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:58,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:15:59,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 04:15:59,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:59,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:01,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 04:16:01,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 04:16:03,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:16:03,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 04:16:03,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:16:04,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:04,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:16:04,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 04:16:04,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 04:16:07,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 04:16:07,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:08,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:10,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 04:16:11,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:16:11,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:13,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:16:16,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:16:19,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 04:16:19,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:16:21,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:16:21,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=245946.66666666666, ans=0.1 2023-09-29 04:16:24,542 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 04:16:28,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:16:28,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:28,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:16:33,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 04:16:34,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:34,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:34,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:16:36,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 04:16:37,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:40,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:42,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:16:47,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=246080.0, ans=0.1 2023-09-29 04:16:50,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 04:16:53,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:02,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:03,839 INFO [train.py:1039] (1/4) Epoch 7, batch 5050, loss[loss=0.2177, simple_loss=0.294, pruned_loss=0.07066, over 24448.00 frames. ], tot_loss[loss=0.2231, simple_loss=0.2874, pruned_loss=0.07936, over 4717983.58 frames. ], batch size: 69, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:17:04,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:05,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:17:05,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:05,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:17:06,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:17:08,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:13,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:13,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 04:17:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:17:15,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=246146.66666666666, ans=0.125 2023-09-29 04:17:16,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:18,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:17:18,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 04:17:20,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:20,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:17:23,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:17:23,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:17:24,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:17:31,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 04:17:31,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:17:33,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:33,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 04:17:33,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:17:33,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=246213.33333333334, ans=0.125 2023-09-29 04:17:35,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:37,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:37,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:17:37,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 04:17:38,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 04:17:40,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:42,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:17:45,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:47,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 04:17:48,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:17:50,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 04:17:52,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:17:52,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:17:52,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:53,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:56,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:17:58,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:17:59,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:59,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:59,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:17:59,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 04:17:59,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:18:03,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:18:06,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:18:06,582 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 04:18:06,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:18:08,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:10,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:10,348 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 04:18:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:13,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 04:18:13,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:18,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 04:18:19,888 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 2.259e+02 2.586e+02 3.154e+02 5.284e+02, threshold=5.172e+02, percent-clipped=3.0 2023-09-29 04:18:20,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 04:18:22,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.73 vs. limit=15.0 2023-09-29 04:18:24,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:24,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:18:24,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:18:26,486 INFO [train.py:1039] (1/4) Epoch 7, batch 5100, loss[loss=0.23, simple_loss=0.2874, pruned_loss=0.08634, over 23773.00 frames. ], tot_loss[loss=0.2244, simple_loss=0.2888, pruned_loss=0.07999, over 4717651.08 frames. ], batch size: 212, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:18:28,063 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 04:18:31,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:35,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 04:18:35,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 04:18:35,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:38,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:18:42,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:42,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 04:18:42,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 04:18:47,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:48,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:18:53,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:54,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=246546.66666666666, ans=0.0 2023-09-29 04:18:57,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 04:18:57,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:00,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:19:00,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 04:19:02,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 04:19:06,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=246613.33333333334, ans=0.2 2023-09-29 04:19:07,207 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 04:19:07,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:07,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 04:19:07,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 04:19:12,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:22,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:25,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 04:19:27,552 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 04:19:27,565 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 04:19:29,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 04:19:29,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:32,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 04:19:35,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 04:19:37,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:19:38,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:19:40,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 04:19:41,357 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.86 vs. limit=15.0 2023-09-29 04:19:43,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:19:45,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 04:19:48,732 INFO [train.py:1039] (1/4) Epoch 7, batch 5150, loss[loss=0.2294, simple_loss=0.2827, pruned_loss=0.08801, over 23317.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2895, pruned_loss=0.07994, over 4716890.50 frames. ], batch size: 119, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:19:50,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:19:50,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:19:50,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:19:51,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:19:51,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:19:53,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:19:54,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 04:19:54,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 04:19:55,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 04:19:55,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:19:55,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 04:19:56,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=246813.33333333334, ans=0.1 2023-09-29 04:19:58,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:58,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:19:58,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:01,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:07,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:20:07,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 04:20:08,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:08,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.51 vs. limit=22.5 2023-09-29 04:20:09,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:20:11,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:20:11,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:11,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:12,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:20:12,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:20:12,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 04:20:14,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:20:14,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:15,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:20:18,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 04:20:18,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=246880.0, ans=0.125 2023-09-29 04:20:19,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:20:26,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:20:29,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 04:20:30,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=246946.66666666666, ans=0.0 2023-09-29 04:20:33,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:39,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:40,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:44,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:44,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:47,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 04:20:49,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:50,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:20:50,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:55,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:55,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:56,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 04:20:59,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=247080.0, ans=0.125 2023-09-29 04:21:00,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:03,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:21:05,025 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.008e+02 2.205e+02 2.538e+02 3.618e+02, threshold=4.410e+02, percent-clipped=0.0 2023-09-29 04:21:05,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:21:05,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:21:06,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:21:06,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:21:06,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:21:06,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:21:07,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=247080.0, ans=0.125 2023-09-29 04:21:10,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:21:11,448 INFO [train.py:1039] (1/4) Epoch 7, batch 5200, loss[loss=0.2314, simple_loss=0.3081, pruned_loss=0.07734, over 24012.00 frames. ], tot_loss[loss=0.2252, simple_loss=0.2903, pruned_loss=0.08009, over 4719606.61 frames. ], batch size: 80, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:21:12,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:21:13,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=247146.66666666666, ans=0.125 2023-09-29 04:21:14,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:19,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 04:21:21,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:21:22,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:25,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:27,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:21:27,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:30,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 04:21:33,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:21:35,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:36,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 04:21:38,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:21:41,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:21:42,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 04:21:42,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 04:21:44,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=247280.0, ans=0.125 2023-09-29 04:21:45,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 04:21:47,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:47,111 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 04:21:47,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:47,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-09-29 04:21:48,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:49,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:21:50,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 04:21:50,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:21:50,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-09-29 04:21:53,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:57,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 04:21:57,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 04:21:57,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 04:22:01,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 04:22:03,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:22:09,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:22:09,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:11,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 04:22:11,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:22:13,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:22:13,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:13,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:18,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:18,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:22:21,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:22:22,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:22,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:27,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:28,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 04:22:30,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:30,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:22:31,788 INFO [train.py:1039] (1/4) Epoch 7, batch 5250, loss[loss=0.2522, simple_loss=0.3203, pruned_loss=0.09201, over 23755.00 frames. ], tot_loss[loss=0.2244, simple_loss=0.2895, pruned_loss=0.07968, over 4730792.88 frames. ], batch size: 85, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:22:31,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:31,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:22:34,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:22:35,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:22:40,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:40,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:22:42,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:22:48,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:50,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:22:52,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:22:53,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:55,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 04:22:55,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:57,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:23:17,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=247680.0, ans=0.0 2023-09-29 04:23:21,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=247680.0, ans=0.0 2023-09-29 04:23:41,159 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.140e+02 2.318e+02 2.697e+02 3.802e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 04:23:46,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=247813.33333333334, ans=0.125 2023-09-29 04:23:47,147 INFO [train.py:1039] (1/4) Epoch 7, batch 5300, loss[loss=0.226, simple_loss=0.2854, pruned_loss=0.0833, over 23303.00 frames. ], tot_loss[loss=0.2234, simple_loss=0.2879, pruned_loss=0.07942, over 4717638.80 frames. ], batch size: 119, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:23:51,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=247813.33333333334, ans=0.1 2023-09-29 04:23:58,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=247813.33333333334, ans=0.1 2023-09-29 04:24:01,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:24:01,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 04:24:01,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 04:24:02,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:02,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:02,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:02,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:02,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:02,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:02,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:02,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:24:03,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:24:03,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 04:24:03,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 04:24:03,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 04:24:03,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:24:03,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 04:24:03,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 04:24:04,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:05,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:05,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:05,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:05,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:24:05,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:05,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:05,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:06,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:06,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:06,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:24:06,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:06,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:24:07,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 04:24:07,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:07,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:07,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 04:24:07,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 04:24:08,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:24:08,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:08,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 04:24:08,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 04:24:08,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:09,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:24:09,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:09,783 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 04:24:09,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 04:24:09,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:24:10,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:10,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 04:24:10,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 04:24:10,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 04:24:10,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:19,232 INFO [train.py:1039] (1/4) Epoch 8, batch 0, loss[loss=0.2372, simple_loss=0.2967, pruned_loss=0.08889, over 23759.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.2967, pruned_loss=0.08889, over 23759.00 frames. ], batch size: 232, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:24:19,232 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 04:24:33,519 INFO [train.py:1071] (1/4) Epoch 8, validation: loss=0.2869, simple_loss=0.2985, pruned_loss=0.1377, over 1125622.00 frames. 2023-09-29 04:24:33,520 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 04:24:33,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 04:24:35,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:24:36,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:24:39,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.29 vs. limit=22.5 2023-09-29 04:24:41,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:41,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:24:41,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:42,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 04:24:44,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 04:24:47,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:49,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:55,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:24:55,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:24:57,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 04:25:01,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:25:10,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:25:10,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:25:13,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 04:25:17,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:25:17,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:25:19,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:24,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:25:30,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:35,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 04:25:39,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 04:25:39,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:25:39,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:40,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:25:41,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:25:42,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=248160.0, ans=0.2 2023-09-29 04:25:46,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 04:25:47,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:49,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:52,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:25:55,303 INFO [train.py:1039] (1/4) Epoch 8, batch 50, loss[loss=0.2427, simple_loss=0.2927, pruned_loss=0.09638, over 22724.00 frames. ], tot_loss[loss=0.2212, simple_loss=0.2888, pruned_loss=0.07676, over 1069715.11 frames. ], batch size: 322, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:25:55,387 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 04:25:55,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:25:55,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=248226.66666666666, ans=0.1 2023-09-29 04:25:57,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=248226.66666666666, ans=0.0 2023-09-29 04:26:00,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:01,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:01,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 04:26:03,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:26:03,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:26:06,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:07,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:08,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=248226.66666666666, ans=0.125 2023-09-29 04:26:09,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:12,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 04:26:14,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:23,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:26:24,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 04:26:26,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 04:26:27,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:26:29,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:26:29,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:31,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:26:31,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:26:32,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:26:32,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:40,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:42,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:26:42,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:26:42,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 04:26:45,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:26:46,039 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.01 vs. limit=22.5 2023-09-29 04:26:46,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:26:46,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 04:26:49,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:49,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 04:26:50,453 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 2.177e+02 2.443e+02 2.821e+02 4.431e+02, threshold=4.886e+02, percent-clipped=0.0 2023-09-29 04:26:55,510 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.58 vs. limit=15.0 2023-09-29 04:26:56,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:26:56,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=248426.66666666666, ans=0.125 2023-09-29 04:26:57,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:58,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:00,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:00,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:04,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 04:27:04,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 04:27:05,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:05,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:07,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:27:07,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:27:07,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 04:27:08,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 04:27:10,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:27:12,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:12,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:27:13,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 04:27:13,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 04:27:13,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:15,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:16,870 INFO [train.py:1039] (1/4) Epoch 8, batch 100, loss[loss=0.1968, simple_loss=0.2679, pruned_loss=0.06286, over 20105.00 frames. ], tot_loss[loss=0.2249, simple_loss=0.2913, pruned_loss=0.07923, over 1853865.78 frames. ], batch size: 44, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:27:16,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:27:16,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:27:18,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:27:19,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.37 vs. limit=15.0 2023-09-29 04:27:23,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:27:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:30,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 04:27:30,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:34,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:27:34,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:34,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:34,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:34,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:37,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 04:27:38,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:27:40,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:40,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:40,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:44,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 04:27:44,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:46,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:48,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:27:49,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:27:52,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 04:27:52,917 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 04:27:54,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:27:54,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:27:57,686 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.80 vs. limit=15.0 2023-09-29 04:27:59,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:28:01,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:01,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:07,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:08,988 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 04:28:10,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:28:13,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:15,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:28:18,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:20,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:23,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:25,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:28:25,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.68 vs. limit=15.0 2023-09-29 04:28:29,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:29,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:31,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:31,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:28:31,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:32,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 04:28:33,020 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 04:28:33,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:33,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:28:34,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:34,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:34,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:28:34,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:28:34,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:28:34,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:36,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:38,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:40,436 INFO [train.py:1039] (1/4) Epoch 8, batch 150, loss[loss=0.2436, simple_loss=0.2959, pruned_loss=0.09561, over 22889.00 frames. ], tot_loss[loss=0.2261, simple_loss=0.2921, pruned_loss=0.08009, over 2490921.61 frames. ], batch size: 322, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:28:40,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:28:40,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:28:42,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=248893.33333333334, ans=0.0 2023-09-29 04:28:42,748 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-09-29 04:28:43,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:43,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=248893.33333333334, ans=0.0 2023-09-29 04:28:46,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:46,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:28:46,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:48,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:50,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:51,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:51,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:55,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 04:28:56,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 04:28:56,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 04:28:59,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:59,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:29:01,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:29:04,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:29:04,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:05,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:06,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:08,224 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 04:29:11,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:15,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:20,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:29:20,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 04:29:24,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=249026.66666666666, ans=15.0 2023-09-29 04:29:24,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:29:24,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:24,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:25,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=249026.66666666666, ans=0.0 2023-09-29 04:29:25,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=249026.66666666666, ans=0.125 2023-09-29 04:29:26,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:29:28,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:29:30,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:29:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 04:29:36,130 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.113e+02 2.401e+02 2.733e+02 5.079e+02, threshold=4.803e+02, percent-clipped=1.0 2023-09-29 04:29:38,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:40,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:29:40,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:29:40,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:29:43,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:45,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 04:29:48,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:29:50,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:29:52,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:29:55,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:29:55,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 04:29:55,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:55,518 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 04:29:58,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:02,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=249226.66666666666, ans=0.125 2023-09-29 04:30:03,312 INFO [train.py:1039] (1/4) Epoch 8, batch 200, loss[loss=0.2159, simple_loss=0.2942, pruned_loss=0.06879, over 24401.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.2921, pruned_loss=0.08013, over 2987076.68 frames. ], batch size: 77, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:30:03,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:30:03,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:30:05,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 04:30:07,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:07,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:10,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 04:30:11,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:30:13,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:15,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:18,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:30:18,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:19,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:28,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=249293.33333333334, ans=0.04949747468305833 2023-09-29 04:30:38,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=249360.0, ans=0.1 2023-09-29 04:30:43,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:30:43,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:30:43,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=249360.0, ans=0.0 2023-09-29 04:30:44,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:30:46,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:30:46,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:30:46,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:30:47,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:49,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:30:49,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:49,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:30:51,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 04:30:53,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:30:53,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:58,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:30:58,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=249426.66666666666, ans=0.025 2023-09-29 04:31:03,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:31:09,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:09,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:31:16,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:19,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 04:31:21,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:21,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:31:21,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:23,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:31:24,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 04:31:25,899 INFO [train.py:1039] (1/4) Epoch 8, batch 250, loss[loss=0.2096, simple_loss=0.2741, pruned_loss=0.07255, over 24422.00 frames. ], tot_loss[loss=0.2242, simple_loss=0.291, pruned_loss=0.07867, over 3380766.58 frames. ], batch size: 58, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:31:25,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:31:26,003 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 04:31:28,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:29,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:31:30,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.79 vs. limit=6.0 2023-09-29 04:31:32,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:32,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:34,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:31:36,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:31:39,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:31:50,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:31:52,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=249626.66666666666, ans=0.2 2023-09-29 04:31:55,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:55,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:32:03,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:32:03,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:32:04,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:32:05,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:05,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:32:05,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:32:05,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:09,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=249693.33333333334, ans=0.125 2023-09-29 04:32:10,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:32:13,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 04:32:13,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:32:16,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:32:16,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:32:16,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:32:17,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:17,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:32:17,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:32:20,304 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.06 vs. limit=12.0 2023-09-29 04:32:20,840 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.201e+02 2.590e+02 2.939e+02 4.400e+02, threshold=5.181e+02, percent-clipped=0.0 2023-09-29 04:32:20,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:22,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:32:22,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:27,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:32:33,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:36,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:32:41,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:41,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=249826.66666666666, ans=0.125 2023-09-29 04:32:42,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:32:46,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 04:32:47,792 INFO [train.py:1039] (1/4) Epoch 8, batch 300, loss[loss=0.2471, simple_loss=0.3188, pruned_loss=0.08771, over 24325.00 frames. ], tot_loss[loss=0.2229, simple_loss=0.2897, pruned_loss=0.0781, over 3681129.24 frames. ], batch size: 77, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:32:47,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:32:47,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:48,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=249893.33333333334, ans=0.0 2023-09-29 04:32:49,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 04:32:50,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:32:51,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:32:51,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 04:32:55,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:55,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:00,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:33:00,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 04:33:02,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:33:04,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:33:04,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 04:33:04,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:04,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=249960.0, ans=0.1 2023-09-29 04:33:08,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:33:14,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:33:16,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 04:33:19,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 04:33:19,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:20,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:22,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:22,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 04:33:22,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:33:25,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:33:27,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:33:28,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:33:33,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:33:33,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 04:33:33,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:33:36,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:38,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 04:33:40,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:45,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:33:47,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=250093.33333333334, ans=0.2 2023-09-29 04:33:49,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:33:49,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 04:33:54,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:54,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:33:56,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:56,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=250160.0, ans=0.125 2023-09-29 04:33:57,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:33:57,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 04:33:57,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:33:59,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:01,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 04:34:01,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:34:02,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:04,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:04,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:05,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:07,480 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:34:10,633 INFO [train.py:1039] (1/4) Epoch 8, batch 350, loss[loss=0.2417, simple_loss=0.3079, pruned_loss=0.08777, over 24059.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2866, pruned_loss=0.07733, over 3904058.06 frames. ], batch size: 80, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:34:12,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:12,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:34:15,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:15,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=250226.66666666666, ans=0.2 2023-09-29 04:34:17,243 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:34:22,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:25,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:26,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:27,979 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:34:29,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 04:34:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:30,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 04:34:33,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:33,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 04:34:35,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:36,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 04:34:38,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:34:40,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:41,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:34:43,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:43,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:45,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:34:45,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:45,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:34:46,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:34:46,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:55,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:34:55,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:34:55,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:34:55,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:59,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 04:34:59,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:59,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=250426.66666666666, ans=0.125 2023-09-29 04:35:05,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:05,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:05,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:35:07,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 04:35:07,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=250426.66666666666, ans=0.0 2023-09-29 04:35:08,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.305e+02 2.882e+02 6.292e+02, threshold=4.610e+02, percent-clipped=1.0 2023-09-29 04:35:09,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:10,503 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 04:35:12,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 04:35:12,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:15,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:35:15,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 04:35:17,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.69 vs. limit=15.0 2023-09-29 04:35:18,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:35:22,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:24,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:24,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:27,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:31,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:35:34,509 INFO [train.py:1039] (1/4) Epoch 8, batch 400, loss[loss=0.2012, simple_loss=0.2757, pruned_loss=0.06331, over 24637.00 frames. ], tot_loss[loss=0.22, simple_loss=0.2852, pruned_loss=0.07737, over 4059395.68 frames. ], batch size: 65, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:35:34,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:35:34,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 04:35:36,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:36,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:36,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:35:37,278 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.26 vs. limit=10.0 2023-09-29 04:35:37,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:39,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:39,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:41,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 04:35:43,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 04:35:43,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:44,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 04:35:45,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=250560.0, ans=0.1 2023-09-29 04:35:46,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:49,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:35:49,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:49,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 04:35:49,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:35:49,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:51,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:51,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:55,262 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 04:35:56,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 04:35:56,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=250626.66666666666, ans=0.0 2023-09-29 04:36:03,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:03,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:05,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 04:36:07,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 04:36:09,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=250693.33333333334, ans=0.0 2023-09-29 04:36:10,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:36:12,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:19,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 04:36:22,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:36:24,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 04:36:27,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:36:27,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:36:29,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 04:36:33,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:36:33,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=250760.0, ans=0.1 2023-09-29 04:36:36,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:36:39,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:43,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:44,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 04:36:46,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:36:46,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=250826.66666666666, ans=0.0 2023-09-29 04:36:47,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 04:36:50,048 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.75 vs. limit=15.0 2023-09-29 04:36:50,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:36:50,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:36:52,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 04:36:55,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:36:55,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:36:55,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:36:56,361 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.62 vs. limit=15.0 2023-09-29 04:36:56,768 INFO [train.py:1039] (1/4) Epoch 8, batch 450, loss[loss=0.2203, simple_loss=0.2846, pruned_loss=0.07799, over 23311.00 frames. ], tot_loss[loss=0.221, simple_loss=0.2864, pruned_loss=0.0778, over 4207722.64 frames. ], batch size: 119, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:36:57,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 04:36:57,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:36:58,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:59,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:36:59,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 04:36:59,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:37:01,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:37:03,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:37:16,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:16,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:18,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 04:37:18,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 04:37:23,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:37:25,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=250960.0, ans=0.0 2023-09-29 04:37:26,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:28,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:29,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=251026.66666666666, ans=0.2 2023-09-29 04:37:31,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:32,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:32,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=251026.66666666666, ans=0.125 2023-09-29 04:37:35,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 04:37:35,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 04:37:38,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 04:37:38,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:37:38,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:41,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:37:43,848 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 04:37:43,862 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 04:37:45,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:47,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:37:49,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:37:52,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:37:52,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:37:52,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=251093.33333333334, ans=0.125 2023-09-29 04:37:54,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:37:54,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 04:37:56,022 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.144e+02 2.402e+02 2.848e+02 5.479e+02, threshold=4.804e+02, percent-clipped=2.0 2023-09-29 04:37:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:59,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:37:59,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:37:59,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 04:38:04,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:38:04,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 04:38:05,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 04:38:06,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-09-29 04:38:07,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:38:13,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:38:15,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:17,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:38:18,557 INFO [train.py:1039] (1/4) Epoch 8, batch 500, loss[loss=0.2181, simple_loss=0.2961, pruned_loss=0.07009, over 24454.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.288, pruned_loss=0.07817, over 4328989.30 frames. ], batch size: 69, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:38:18,625 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 04:38:22,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:22,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=251226.66666666666, ans=0.1 2023-09-29 04:38:24,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:38:24,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:25,514 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 04:38:25,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 04:38:25,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:29,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:38:32,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:38:35,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:38:37,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:38,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:39,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:38:49,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:49,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:38:50,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:38:50,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:51,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 04:38:51,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:38:54,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:38:56,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:38:56,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:38:56,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:58,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 04:39:00,710 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 04:39:02,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=251360.0, ans=0.2 2023-09-29 04:39:03,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:04,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=251360.0, ans=0.0 2023-09-29 04:39:05,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:07,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:07,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:39:07,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=251426.66666666666, ans=0.035 2023-09-29 04:39:09,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 04:39:13,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:39:14,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:17,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:19,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:25,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:28,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 04:39:28,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:28,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:32,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 04:39:34,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:39:35,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:35,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=251493.33333333334, ans=0.0 2023-09-29 04:39:41,370 INFO [train.py:1039] (1/4) Epoch 8, batch 550, loss[loss=0.2352, simple_loss=0.2974, pruned_loss=0.08648, over 24425.00 frames. ], tot_loss[loss=0.2228, simple_loss=0.2885, pruned_loss=0.07853, over 4413087.09 frames. ], batch size: 63, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:39:41,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 04:39:43,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 04:39:43,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:43,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 04:39:44,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:39:44,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:39:49,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:39:50,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:52,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 04:39:52,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:39:58,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:39:58,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:00,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:02,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:08,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 04:40:10,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 04:40:11,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:40:16,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:40:16,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:16,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:40:20,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:20,935 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 04:40:22,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:23,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:40:25,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:25,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:40:27,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:40:27,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:28,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 04:40:29,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=251760.0, ans=0.125 2023-09-29 04:40:29,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.84 vs. limit=15.0 2023-09-29 04:40:30,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 04:40:30,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:32,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:32,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:40:32,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:40:37,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:40:37,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:40:38,972 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.030e+02 2.358e+02 2.809e+02 4.445e+02, threshold=4.716e+02, percent-clipped=0.0 2023-09-29 04:40:39,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:40:41,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:42,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:40:43,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:40:44,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:46,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:40:46,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:46,640 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.33 vs. limit=22.5 2023-09-29 04:40:48,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:40:49,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:40:50,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=251826.66666666666, ans=0.125 2023-09-29 04:40:55,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 04:40:58,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 04:40:59,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:40:59,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:41:01,244 INFO [train.py:1039] (1/4) Epoch 8, batch 600, loss[loss=0.2618, simple_loss=0.3029, pruned_loss=0.1104, over 19973.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2891, pruned_loss=0.07908, over 4477926.37 frames. ], batch size: 388, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:41:01,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:01,969 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-29 04:41:08,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:41:12,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:41:14,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 04:41:15,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:41:18,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:21,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:24,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 04:41:24,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:41:30,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 04:41:33,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:41:33,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:33,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:41:40,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:41:40,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:41:43,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:51,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:41:52,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-09-29 04:41:56,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:56,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:56,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:42:01,753 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.44 vs. limit=15.0 2023-09-29 04:42:02,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 04:42:07,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:42:08,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:12,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 04:42:12,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:42:15,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 04:42:15,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:42:16,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:42:22,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:42:25,708 INFO [train.py:1039] (1/4) Epoch 8, batch 650, loss[loss=0.1996, simple_loss=0.2804, pruned_loss=0.05937, over 24636.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.2877, pruned_loss=0.07828, over 4533000.23 frames. ], batch size: 68, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:42:25,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:42:28,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:42:29,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:42:31,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:34,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 04:42:35,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:42:40,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:42:40,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:43,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:47,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 04:42:48,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:42:48,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:52,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:52,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=252293.33333333334, ans=0.125 2023-09-29 04:42:54,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:42:55,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:57,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:59,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:42:59,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:02,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:43:05,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:43:05,748 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 04:43:05,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:05,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:10,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:11,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:11,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:43:12,682 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.63 vs. limit=15.0 2023-09-29 04:43:13,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 04:43:14,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:43:14,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:43:16,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:43:16,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:18,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:43:19,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 04:43:19,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 04:43:19,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:19,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:21,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:43:21,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:43:23,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:43:26,234 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.104e+02 2.347e+02 2.945e+02 4.272e+02, threshold=4.693e+02, percent-clipped=0.0 2023-09-29 04:43:30,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=252493.33333333334, ans=0.1 2023-09-29 04:43:31,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:32,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:43:34,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:37,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:37,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:43:38,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:46,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:43:46,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:47,889 INFO [train.py:1039] (1/4) Epoch 8, batch 700, loss[loss=0.2188, simple_loss=0.2785, pruned_loss=0.07954, over 23269.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2863, pruned_loss=0.07738, over 4569638.63 frames. ], batch size: 119, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:43:47,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:43:48,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:52,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 04:43:52,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 04:43:55,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 04:43:56,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:59,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:44:01,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 04:44:06,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:08,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=252626.66666666666, ans=0.125 2023-09-29 04:44:09,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:44:11,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:13,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:44:13,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:44:16,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:18,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=252626.66666666666, ans=0.2 2023-09-29 04:44:19,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 04:44:19,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:44:21,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 04:44:22,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 04:44:26,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:44:26,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:44:27,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:44:28,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=252693.33333333334, ans=0.0 2023-09-29 04:44:32,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:44:34,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 04:44:38,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:38,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:44:38,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 04:44:43,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:45,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:48,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:44:52,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=252760.0, ans=0.125 2023-09-29 04:44:53,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:44:53,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 04:44:56,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 04:44:56,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 04:44:59,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:03,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:04,421 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-09-29 04:45:04,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:06,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 04:45:10,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.86 vs. limit=10.0 2023-09-29 04:45:11,570 INFO [train.py:1039] (1/4) Epoch 8, batch 750, loss[loss=0.194, simple_loss=0.2731, pruned_loss=0.05749, over 24485.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.2847, pruned_loss=0.07692, over 4593953.14 frames. ], batch size: 66, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:45:11,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 04:45:11,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 04:45:11,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 04:45:13,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 04:45:13,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 04:45:14,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:45:16,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 04:45:17,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:17,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:19,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:21,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:21,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:45:21,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:24,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:45:26,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:45:28,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:45:31,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:33,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:33,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 04:45:35,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:45:36,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:37,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:39,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:45:40,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 04:45:40,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:42,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 04:45:44,033 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 04:45:45,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 04:45:45,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:45:45,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:45:45,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=253026.66666666666, ans=0.04949747468305833 2023-09-29 04:45:47,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:45:55,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:55,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:45:55,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:45:58,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:58,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:00,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 04:46:00,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:46:01,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:46:02,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=253093.33333333334, ans=0.125 2023-09-29 04:46:03,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:46:08,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:46:08,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 04:46:08,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:11,308 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.056e+02 2.287e+02 2.694e+02 4.439e+02, threshold=4.575e+02, percent-clipped=0.0 2023-09-29 04:46:13,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:13,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:46:15,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:18,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:46:22,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 04:46:24,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:24,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:31,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:31,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=253226.66666666666, ans=0.0 2023-09-29 04:46:32,917 INFO [train.py:1039] (1/4) Epoch 8, batch 800, loss[loss=0.24, simple_loss=0.311, pruned_loss=0.0845, over 23936.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.2851, pruned_loss=0.07672, over 4627027.38 frames. ], batch size: 86, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:46:32,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:46:42,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:42,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:44,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:44,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:46,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:46,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:47,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:53,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:54,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:46:56,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 04:46:56,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:59,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:59,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:46:59,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:46:59,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 04:46:59,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:01,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 04:47:04,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:07,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:08,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:47:08,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:47:13,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:13,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:13,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=253360.0, ans=0.0 2023-09-29 04:47:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:47:18,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:47:18,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:47:20,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 04:47:21,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 04:47:21,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:47:21,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:47:23,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:23,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:47:28,454 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 04:47:29,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 04:47:31,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:47:33,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:47:36,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:47:40,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:40,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=253493.33333333334, ans=0.1 2023-09-29 04:47:41,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 04:47:42,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:47:44,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 04:47:52,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:47:52,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=253493.33333333334, ans=0.125 2023-09-29 04:47:56,237 INFO [train.py:1039] (1/4) Epoch 8, batch 850, loss[loss=0.1957, simple_loss=0.2707, pruned_loss=0.06039, over 24582.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.286, pruned_loss=0.07757, over 4653407.28 frames. ], batch size: 60, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:47:56,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:47:56,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 04:47:57,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:47:59,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:59,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 04:48:00,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:02,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:48:03,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:05,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:48:06,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:48:08,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 04:48:08,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 04:48:08,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 04:48:10,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:48:10,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:48:11,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=253626.66666666666, ans=0.125 2023-09-29 04:48:12,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:14,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:48:14,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:48:20,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:20,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:20,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 04:48:24,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 04:48:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:28,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 04:48:32,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 04:48:34,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 04:48:37,168 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 04:48:37,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:37,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:48:37,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:48:40,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 04:48:42,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=253693.33333333334, ans=0.0 2023-09-29 04:48:43,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:45,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:45,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=253760.0, ans=0.0 2023-09-29 04:48:47,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:48:47,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:48:50,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:48:52,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:48:52,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 04:48:56,835 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.108e+02 2.249e+02 2.560e+02 3.769e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 04:48:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:48:57,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:48:58,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:48:58,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:00,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:01,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:49:05,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:49:05,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=253826.66666666666, ans=0.125 2023-09-29 04:49:07,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:49:08,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:08,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:49:15,893 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.70 vs. limit=15.0 2023-09-29 04:49:17,879 INFO [train.py:1039] (1/4) Epoch 8, batch 900, loss[loss=0.234, simple_loss=0.2957, pruned_loss=0.08621, over 23254.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2861, pruned_loss=0.07704, over 4677226.37 frames. ], batch size: 105, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:49:18,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:49:20,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:49:20,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 04:49:20,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:49:20,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:20,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=253893.33333333334, ans=0.125 2023-09-29 04:49:21,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 04:49:23,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=253893.33333333334, ans=0.2 2023-09-29 04:49:23,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=253893.33333333334, ans=0.1 2023-09-29 04:49:28,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:49:33,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:33,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 04:49:36,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:49:36,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 04:49:38,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:49:40,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:49:40,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:40,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:49:40,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:49:50,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:50,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:50,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:49:52,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:56,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=254026.66666666666, ans=0.125 2023-09-29 04:49:58,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 04:49:58,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=254026.66666666666, ans=0.125 2023-09-29 04:50:00,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:50:04,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:50:04,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:50:05,768 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 04:50:05,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 04:50:06,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=254026.66666666666, ans=0.0 2023-09-29 04:50:12,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=254093.33333333334, ans=0.125 2023-09-29 04:50:15,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:50:15,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:50:15,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:50:21,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:21,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:50:25,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 04:50:25,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:50:28,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 04:50:29,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:50:31,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:31,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:50:31,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:50:33,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=254160.0, ans=0.95 2023-09-29 04:50:36,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 04:50:36,821 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 04:50:38,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:50:38,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 04:50:40,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=254226.66666666666, ans=0.07 2023-09-29 04:50:41,189 INFO [train.py:1039] (1/4) Epoch 8, batch 950, loss[loss=0.2241, simple_loss=0.2816, pruned_loss=0.08332, over 23657.00 frames. ], tot_loss[loss=0.2215, simple_loss=0.2873, pruned_loss=0.07787, over 4679675.36 frames. ], batch size: 232, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:50:41,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:46,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 04:50:49,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=254226.66666666666, ans=0.1 2023-09-29 04:50:52,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:50:52,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=254226.66666666666, ans=0.0 2023-09-29 04:50:53,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:53,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:55,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:50:56,846 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 04:51:01,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:02,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:03,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:04,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:51:04,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 04:51:04,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:51:07,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:08,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 04:51:08,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:12,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:12,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:13,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:51:13,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 04:51:15,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:51:18,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:20,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:51:24,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:51:24,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:28,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 04:51:29,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 04:51:29,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:51:29,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:31,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:31,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:51:32,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=254426.66666666666, ans=0.2 2023-09-29 04:51:37,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 04:51:39,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:51:39,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=254426.66666666666, ans=0.1 2023-09-29 04:51:41,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=254426.66666666666, ans=0.0 2023-09-29 04:51:41,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.975e+02 2.273e+02 2.583e+02 4.078e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 04:51:42,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:42,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:42,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 04:51:42,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:42,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:51:42,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 04:51:44,666 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:51:48,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:51:53,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:59,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:51:59,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 04:52:01,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 04:52:03,997 INFO [train.py:1039] (1/4) Epoch 8, batch 1000, loss[loss=0.2214, simple_loss=0.2735, pruned_loss=0.08466, over 23712.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2866, pruned_loss=0.07732, over 4694272.68 frames. ], batch size: 212, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:52:04,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:52:04,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=254560.0, ans=0.125 2023-09-29 04:52:07,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 04:52:09,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:15,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:52:16,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 04:52:16,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 04:52:22,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:22,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:52:22,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=254626.66666666666, ans=0.125 2023-09-29 04:52:23,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:27,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 04:52:31,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 04:52:32,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 04:52:32,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:34,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 04:52:37,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 04:52:37,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 04:52:39,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:40,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:49,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:50,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:52:50,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:51,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:51,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 04:52:52,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:53,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:52:53,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:53,739 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 04:52:58,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 04:52:58,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 04:53:00,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 04:53:01,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=254760.0, ans=0.125 2023-09-29 04:53:02,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:53:09,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:09,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:53:10,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:10,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:53:13,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 04:53:14,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.68 vs. limit=22.5 2023-09-29 04:53:15,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:53:15,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 04:53:15,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=254826.66666666666, ans=0.125 2023-09-29 04:53:17,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 04:53:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:19,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:53:22,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:53:22,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:53:26,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:53:27,355 INFO [train.py:1039] (1/4) Epoch 8, batch 1050, loss[loss=0.2114, simple_loss=0.291, pruned_loss=0.06587, over 24645.00 frames. ], tot_loss[loss=0.2199, simple_loss=0.286, pruned_loss=0.07691, over 4701260.10 frames. ], batch size: 65, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:53:30,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:53:30,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:53:32,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:53:33,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:38,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:53:39,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:53:41,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:53:42,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:53:44,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:53:44,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:53:44,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:53:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 04:53:47,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:53:47,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 04:53:49,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.02 vs. limit=15.0 2023-09-29 04:53:50,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:50,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 04:53:50,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:53:57,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:59,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:54:00,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:54:02,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=255026.66666666666, ans=0.125 2023-09-29 04:54:03,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 04:54:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 04:54:03,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:54:07,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 04:54:11,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 04:54:12,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:15,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:54:19,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 04:54:19,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:54:19,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:54:22,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:54:25,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.98 vs. limit=22.5 2023-09-29 04:54:27,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 04:54:28,814 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.059e+02 2.267e+02 2.771e+02 5.438e+02, threshold=4.534e+02, percent-clipped=2.0 2023-09-29 04:54:28,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 04:54:29,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 04:54:30,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:30,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:54:31,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=255093.33333333334, ans=0.125 2023-09-29 04:54:32,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 04:54:37,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:54:38,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:38,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:54:38,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:38,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:44,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:44,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 04:54:45,785 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=10.31 vs. limit=15.0 2023-09-29 04:54:46,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:46,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 04:54:47,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 04:54:47,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:54:50,434 INFO [train.py:1039] (1/4) Epoch 8, batch 1100, loss[loss=0.2408, simple_loss=0.2918, pruned_loss=0.09491, over 23820.00 frames. ], tot_loss[loss=0.2198, simple_loss=0.2856, pruned_loss=0.07697, over 4702081.19 frames. ], batch size: 164, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:54:51,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-09-29 04:54:52,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:54:56,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:55:00,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:55:03,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:55:03,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:03,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 04:55:05,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:55:07,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=255293.33333333334, ans=0.0 2023-09-29 04:55:08,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:55:10,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:55:13,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:55:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 04:55:15,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:55:17,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:17,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:55:20,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:55:23,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:55:24,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=255360.0, ans=0.125 2023-09-29 04:55:25,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=255360.0, ans=0.07 2023-09-29 04:55:29,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:55:32,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 04:55:33,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.51 vs. limit=15.0 2023-09-29 04:55:34,459 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 04:55:34,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=255360.0, ans=0.0 2023-09-29 04:55:35,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:37,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:39,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:55:40,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:55:42,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 04:55:43,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:55:43,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:55:43,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:55:44,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:45,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 04:55:50,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:55:50,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 04:55:54,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:55:59,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:56:01,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 04:56:01,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:56:02,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:05,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:05,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:06,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=255493.33333333334, ans=0.125 2023-09-29 04:56:07,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 04:56:08,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:56:08,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:10,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 04:56:10,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:56:11,501 INFO [train.py:1039] (1/4) Epoch 8, batch 1150, loss[loss=0.2318, simple_loss=0.2943, pruned_loss=0.08465, over 23447.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2859, pruned_loss=0.07717, over 4716801.16 frames. ], batch size: 105, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:56:11,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 04:56:13,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:56:13,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:56:15,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:56:19,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:21,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:56:25,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:25,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:56:25,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 04:56:25,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=255560.0, ans=0.1 2023-09-29 04:56:26,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:29,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 04:56:32,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:32,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:56:36,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 04:56:38,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:40,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=255626.66666666666, ans=0.125 2023-09-29 04:56:41,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:43,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:56:43,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 04:56:43,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:56:44,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:48,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 04:56:49,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:51,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:57:03,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:09,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:09,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 04:57:11,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:11,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:12,647 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 2.096e+02 2.373e+02 2.802e+02 4.520e+02, threshold=4.746e+02, percent-clipped=0.0 2023-09-29 04:57:17,557 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 04:57:19,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:27,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 04:57:30,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:32,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:57:32,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:57:33,612 INFO [train.py:1039] (1/4) Epoch 8, batch 1200, loss[loss=0.2112, simple_loss=0.2867, pruned_loss=0.06787, over 24044.00 frames. ], tot_loss[loss=0.2209, simple_loss=0.2864, pruned_loss=0.0777, over 4712825.29 frames. ], batch size: 80, lr: 1.32e-02, grad_scale: 32.0 2023-09-29 04:57:33,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:57:37,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:42,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:57:43,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:57:45,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:57:45,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:45,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:57:46,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:57:48,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:57:49,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:49,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:51,453 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 04:57:56,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 04:58:00,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:58:03,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:58:04,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:04,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=256026.66666666666, ans=0.1 2023-09-29 04:58:06,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 04:58:08,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:58:18,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:58:18,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 04:58:20,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:58:20,793 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:58:23,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 04:58:26,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 04:58:26,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:28,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:58:29,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:29,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:58:31,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:31,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:58:33,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:58:33,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 04:58:33,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:58:35,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:35,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:58:38,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:58:38,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:43,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:58:46,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:58:48,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 04:58:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 04:58:53,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:56,294 INFO [train.py:1039] (1/4) Epoch 8, batch 1250, loss[loss=0.2177, simple_loss=0.2872, pruned_loss=0.07407, over 24448.00 frames. ], tot_loss[loss=0.2225, simple_loss=0.2879, pruned_loss=0.07859, over 4714485.20 frames. ], batch size: 63, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 04:58:56,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:56,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:58:59,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:59:01,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 04:59:06,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:59:08,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:08,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 04:59:08,956 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.04 vs. limit=22.5 2023-09-29 04:59:11,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:59:11,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:59:15,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:59:17,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:18,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:59:18,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:20,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:59:25,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:59:25,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:59:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:59:25,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:27,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:27,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=256293.33333333334, ans=0.0 2023-09-29 04:59:30,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:31,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:59:36,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 04:59:36,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:59:40,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:59:41,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 04:59:43,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:43,078 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 04:59:43,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:43,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:47,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:50,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:51,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:59:52,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 04:59:52,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 04:59:52,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 04:59:56,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:59:58,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 04:59:58,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:01,795 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.911e+02 2.146e+02 2.412e+02 3.765e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 05:00:03,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:00:03,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:00:03,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=256493.33333333334, ans=0.0 2023-09-29 05:00:05,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 05:00:05,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:00:05,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:00:05,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:00:06,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:08,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 05:00:09,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.21 vs. limit=22.5 2023-09-29 05:00:10,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:10,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=256493.33333333334, ans=0.2 2023-09-29 05:00:12,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:00:13,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:00:13,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=256493.33333333334, ans=0.0 2023-09-29 05:00:16,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:00:19,981 INFO [train.py:1039] (1/4) Epoch 8, batch 1300, loss[loss=0.227, simple_loss=0.301, pruned_loss=0.07648, over 24628.00 frames. ], tot_loss[loss=0.2238, simple_loss=0.289, pruned_loss=0.07929, over 4708248.61 frames. ], batch size: 68, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:00:21,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:22,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 05:00:25,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:26,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:00:26,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:00:28,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:31,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:00:33,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 05:00:39,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:00:39,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:00:42,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 05:00:43,118 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-09-29 05:00:44,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:00:49,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:49,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=256626.66666666666, ans=0.0 2023-09-29 05:00:50,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:51,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:53,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:54,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:00:54,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:00:56,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 05:01:02,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:01:02,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:01:04,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 05:01:06,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:01:09,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:01:09,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:01:09,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=256760.0, ans=0.2 2023-09-29 05:01:10,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 05:01:10,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:11,222 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:01:12,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 05:01:13,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:17,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:01:17,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:01:22,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 05:01:22,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 05:01:23,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 05:01:27,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:01:30,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 05:01:34,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:36,137 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:01:42,298 INFO [train.py:1039] (1/4) Epoch 8, batch 1350, loss[loss=0.2031, simple_loss=0.2742, pruned_loss=0.06604, over 24288.00 frames. ], tot_loss[loss=0.2231, simple_loss=0.2879, pruned_loss=0.07918, over 4709239.66 frames. ], batch size: 56, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:01:42,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 05:01:46,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:48,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:01:51,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:51,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:54,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:01:54,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:01:58,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:02:00,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 05:02:02,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:03,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:02:05,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 05:02:05,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:02:07,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:02:07,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 05:02:10,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 05:02:12,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 05:02:14,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:14,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 05:02:17,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=257026.66666666666, ans=0.1 2023-09-29 05:02:24,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=15.03 vs. limit=15.0 2023-09-29 05:02:25,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=257026.66666666666, ans=0.0 2023-09-29 05:02:27,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:29,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=257026.66666666666, ans=0.0 2023-09-29 05:02:37,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:37,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 05:02:40,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:44,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 05:02:44,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:45,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:02:47,104 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.144e+02 2.487e+02 2.898e+02 4.537e+02, threshold=4.974e+02, percent-clipped=1.0 2023-09-29 05:02:48,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:02:50,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 05:02:53,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:02:53,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=257160.0, ans=0.125 2023-09-29 05:02:54,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=257160.0, ans=0.125 2023-09-29 05:02:58,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 05:03:00,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 05:03:05,415 INFO [train.py:1039] (1/4) Epoch 8, batch 1400, loss[loss=0.1947, simple_loss=0.2649, pruned_loss=0.06222, over 21435.00 frames. ], tot_loss[loss=0.2219, simple_loss=0.2861, pruned_loss=0.07879, over 4686778.52 frames. ], batch size: 47, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:03:09,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 05:03:10,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:03:12,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:03:12,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=257226.66666666666, ans=0.2 2023-09-29 05:03:13,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:03:13,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=257226.66666666666, ans=0.015 2023-09-29 05:03:14,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=257226.66666666666, ans=15.0 2023-09-29 05:03:18,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 05:03:22,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 05:03:29,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=257293.33333333334, ans=0.0 2023-09-29 05:03:30,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:03:31,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=257293.33333333334, ans=10.0 2023-09-29 05:03:32,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:34,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:03:34,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:03:38,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:03:38,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 05:03:45,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.03 vs. limit=6.0 2023-09-29 05:03:48,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:50,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:54,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 05:03:54,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:03:56,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:03:56,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:03:57,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:59,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:03:59,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:04:01,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:04:01,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 05:04:02,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:04:07,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:11,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:04:18,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=257493.33333333334, ans=0.5 2023-09-29 05:04:18,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=257493.33333333334, ans=0.0 2023-09-29 05:04:19,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 05:04:21,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:04:21,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:04:24,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 05:04:24,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:28,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:04:29,520 INFO [train.py:1039] (1/4) Epoch 8, batch 1450, loss[loss=0.2124, simple_loss=0.2933, pruned_loss=0.06578, over 24685.00 frames. ], tot_loss[loss=0.2212, simple_loss=0.2857, pruned_loss=0.07837, over 4690368.39 frames. ], batch size: 73, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:04:31,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:04:34,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:04:34,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:34,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:04:36,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=257560.0, ans=0.125 2023-09-29 05:04:39,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:41,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:04:42,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:04:42,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 05:04:44,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:04:46,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 05:04:46,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:48,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:48,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 05:04:50,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:04:50,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:04:50,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=257626.66666666666, ans=0.0 2023-09-29 05:04:51,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 05:04:51,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=257626.66666666666, ans=0.0 2023-09-29 05:04:52,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:54,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:04:55,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:59,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:02,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:05:02,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:05:05,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:05:05,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:10,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:05:10,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:13,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 05:05:15,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:05:20,624 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 05:05:22,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:05:25,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:27,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 05:05:29,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=257760.0, ans=0.125 2023-09-29 05:05:30,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:32,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 05:05:33,284 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.099e+02 2.346e+02 2.740e+02 3.754e+02, threshold=4.692e+02, percent-clipped=0.0 2023-09-29 05:05:33,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 05:05:35,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:36,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:05:38,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:40,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 05:05:44,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 05:05:44,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 05:05:45,405 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=15.0 2023-09-29 05:05:46,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:48,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:05:51,424 INFO [train.py:1039] (1/4) Epoch 8, batch 1500, loss[loss=0.2088, simple_loss=0.2854, pruned_loss=0.06614, over 24442.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.286, pruned_loss=0.07774, over 4708397.47 frames. ], batch size: 63, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:05:59,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 05:06:00,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:06:00,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:06:01,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:02,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:03,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:06:05,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 05:06:05,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:06:06,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:06:06,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:08,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:06:09,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:06:11,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 05:06:16,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:06:16,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:06:16,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:21,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 05:06:27,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 05:06:28,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:28,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 05:06:31,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:06:33,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:06:33,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:33,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:06:35,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=258026.66666666666, ans=0.0 2023-09-29 05:06:36,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 05:06:36,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:06:38,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:38,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 05:06:39,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:44,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:06:44,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 05:06:52,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:06:53,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:06:59,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 05:07:00,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:00,504 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 05:07:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:02,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:02,281 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 05:07:02,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.96 vs. limit=15.0 2023-09-29 05:07:05,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:07:08,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 05:07:11,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:12,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:12,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:14,179 INFO [train.py:1039] (1/4) Epoch 8, batch 1550, loss[loss=0.2264, simple_loss=0.2843, pruned_loss=0.08427, over 23849.00 frames. ], tot_loss[loss=0.2217, simple_loss=0.287, pruned_loss=0.07825, over 4706927.04 frames. ], batch size: 195, lr: 1.31e-02, grad_scale: 4.0 2023-09-29 05:07:14,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:14,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:15,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:07:16,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=258226.66666666666, ans=0.125 2023-09-29 05:07:17,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 05:07:18,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 05:07:18,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:07:19,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=258226.66666666666, ans=0.125 2023-09-29 05:07:20,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 05:07:20,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 05:07:22,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:23,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:23,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:25,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:07:27,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:27,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:29,150 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 05:07:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:30,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:07:30,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:07:31,435 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.23 vs. limit=22.5 2023-09-29 05:07:32,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:07:32,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 05:07:34,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:36,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 05:07:36,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 05:07:36,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 05:07:38,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:40,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:43,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:43,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 05:07:43,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 05:07:53,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:57,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:57,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:07:58,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:07:58,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 05:08:04,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:08:06,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:10,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:08:13,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:08:15,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:08:15,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 05:08:15,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:18,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:08:18,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:19,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:08:19,652 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 05:08:20,935 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.986e+02 2.177e+02 2.778e+02 5.075e+02, threshold=4.355e+02, percent-clipped=1.0 2023-09-29 05:08:21,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:27,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 05:08:31,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:31,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:33,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 05:08:36,171 INFO [train.py:1039] (1/4) Epoch 8, batch 1600, loss[loss=0.2491, simple_loss=0.3039, pruned_loss=0.0971, over 23428.00 frames. ], tot_loss[loss=0.2238, simple_loss=0.288, pruned_loss=0.07976, over 4702341.22 frames. ], batch size: 285, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:08:37,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:37,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:37,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:08:37,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:08:39,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:08:41,524 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-29 05:08:43,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:43,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 05:08:45,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 05:08:46,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 05:08:50,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:08:52,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 05:08:52,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=258626.66666666666, ans=0.125 2023-09-29 05:08:53,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:08:55,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:08:59,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:09:00,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=258626.66666666666, ans=0.125 2023-09-29 05:09:02,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 05:09:06,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:09:07,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 05:09:07,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:09,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 05:09:12,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=258693.33333333334, ans=0.2 2023-09-29 05:09:15,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 05:09:22,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=258693.33333333334, ans=0.1 2023-09-29 05:09:24,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:24,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 05:09:25,610 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.55 vs. limit=15.0 2023-09-29 05:09:26,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:26,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:09:26,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:09:29,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:09:33,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:09:34,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:09:35,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=258760.0, ans=0.2 2023-09-29 05:09:36,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:09:39,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:09:41,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:09:41,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:09:48,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=258826.66666666666, ans=0.0 2023-09-29 05:09:49,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:49,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:09:52,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 05:09:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:09:53,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 05:09:58,309 INFO [train.py:1039] (1/4) Epoch 8, batch 1650, loss[loss=0.1973, simple_loss=0.274, pruned_loss=0.06033, over 24461.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2887, pruned_loss=0.07978, over 4711189.10 frames. ], batch size: 66, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:09:59,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:01,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:01,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:10:01,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 05:10:01,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 05:10:01,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 05:10:03,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 05:10:04,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=258893.33333333334, ans=0.07 2023-09-29 05:10:07,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:10:07,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:07,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:07,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:10:07,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=258893.33333333334, ans=0.125 2023-09-29 05:10:10,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:13,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 05:10:15,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=258960.0, ans=0.0 2023-09-29 05:10:15,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.62 vs. limit=15.0 2023-09-29 05:10:16,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:10:16,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:16,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:10:16,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:10:16,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 05:10:16,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 05:10:23,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:10:25,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:10:26,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=258960.0, ans=0.0 2023-09-29 05:10:29,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=259026.66666666666, ans=0.0 2023-09-29 05:10:35,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 05:10:35,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:37,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 05:10:41,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:10:43,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:10:43,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:10:44,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:10:46,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:47,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:50,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:51,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:10:51,640 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.33 vs. limit=15.0 2023-09-29 05:10:53,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:54,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:10:58,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:59,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 05:11:01,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:11:02,714 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.991e+02 2.189e+02 2.754e+02 4.240e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 05:11:02,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 05:11:03,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 05:11:03,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 05:11:03,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:04,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:11:04,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:06,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:11:06,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 05:11:08,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.30 vs. limit=15.0 2023-09-29 05:11:09,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:11,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:11:11,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:14,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 05:11:18,716 INFO [train.py:1039] (1/4) Epoch 8, batch 1700, loss[loss=0.2244, simple_loss=0.279, pruned_loss=0.08491, over 23766.00 frames. ], tot_loss[loss=0.2222, simple_loss=0.2873, pruned_loss=0.07853, over 4712051.23 frames. ], batch size: 164, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:11:18,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:18,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:11:18,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 05:11:18,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:19,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:11:19,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:25,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:11:25,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:11:25,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 05:11:28,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:11:37,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:40,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:11:47,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:11:47,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:11:48,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:48,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:11:51,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 05:11:53,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:11:53,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:54,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:11:55,857 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.52 vs. limit=22.5 2023-09-29 05:11:56,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:11:57,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 05:11:58,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 05:12:00,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:02,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 05:12:03,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:12:03,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=259360.0, ans=0.125 2023-09-29 05:12:11,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=259426.66666666666, ans=0.125 2023-09-29 05:12:12,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:14,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:15,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:12:15,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:12:15,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 05:12:17,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:12:18,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:18,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 05:12:18,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:12:19,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:20,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:20,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:22,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:22,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:12:24,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:25,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:12:25,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:29,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:29,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 05:12:31,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=259493.33333333334, ans=0.125 2023-09-29 05:12:33,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:35,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:35,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=259493.33333333334, ans=0.04949747468305833 2023-09-29 05:12:38,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 05:12:41,289 INFO [train.py:1039] (1/4) Epoch 8, batch 1750, loss[loss=0.2031, simple_loss=0.2758, pruned_loss=0.0652, over 24536.00 frames. ], tot_loss[loss=0.2202, simple_loss=0.2858, pruned_loss=0.07727, over 4712009.95 frames. ], batch size: 63, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:12:43,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:45,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:45,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:12:47,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 05:12:47,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:51,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:12:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:54,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 05:12:57,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:00,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 05:13:00,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:02,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:13:05,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:13:05,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 05:13:08,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:13:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 05:13:09,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.55 vs. limit=10.0 2023-09-29 05:13:19,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:13:22,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:13:22,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:27,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:27,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:29,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:13:30,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:33,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:33,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:35,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 05:13:36,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:39,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 05:13:39,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:41,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:42,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:13:47,666 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.975e+02 2.294e+02 2.712e+02 4.778e+02, threshold=4.588e+02, percent-clipped=2.0 2023-09-29 05:13:47,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:13:47,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 05:13:49,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:49,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:56,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:58,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:00,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:14:02,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 05:14:02,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:03,441 INFO [train.py:1039] (1/4) Epoch 8, batch 1800, loss[loss=0.2154, simple_loss=0.2917, pruned_loss=0.0696, over 24291.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2844, pruned_loss=0.07624, over 4719139.39 frames. ], batch size: 74, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:14:03,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:14:03,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:03,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:14:03,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:14:05,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:14:08,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:14:09,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:14:10,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=259893.33333333334, ans=0.2 2023-09-29 05:14:11,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:14:14,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:17,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:14:19,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:14:23,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:25,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:26,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:27,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=259960.0, ans=0.125 2023-09-29 05:14:29,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:14:30,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:30,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 05:14:32,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:34,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:38,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 05:14:41,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 05:14:41,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 05:14:41,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:41,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:41,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:14:42,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:14:50,448 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 05:14:50,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:14:50,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=260093.33333333334, ans=0.1 2023-09-29 05:14:52,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:55,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 05:14:55,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 05:14:57,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:14:59,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:15:00,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:15:06,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 05:15:07,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.45 vs. limit=6.0 2023-09-29 05:15:08,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.37 vs. limit=15.0 2023-09-29 05:15:09,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=260160.0, ans=0.125 2023-09-29 05:15:12,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:15:12,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 05:15:14,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:15:14,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:14,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:15:14,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 05:15:17,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:15:17,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:20,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 05:15:20,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:21,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:21,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:15:21,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:15:25,492 INFO [train.py:1039] (1/4) Epoch 8, batch 1850, loss[loss=0.2294, simple_loss=0.2897, pruned_loss=0.08453, over 23445.00 frames. ], tot_loss[loss=0.2187, simple_loss=0.2847, pruned_loss=0.07641, over 4714769.36 frames. ], batch size: 285, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:15:27,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:15:27,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:27,916 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:15:30,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:15:32,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:15:42,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:15:42,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 05:15:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 05:15:46,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=260293.33333333334, ans=0.5 2023-09-29 05:15:48,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 05:15:49,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.12 vs. limit=22.5 2023-09-29 05:15:52,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:52,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 05:15:52,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 05:15:57,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=260360.0, ans=0.0 2023-09-29 05:16:02,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:16:03,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 05:16:08,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:08,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:12,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 05:16:14,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:14,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:16:15,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:16:17,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.31 vs. limit=15.0 2023-09-29 05:16:18,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:16:19,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=260426.66666666666, ans=0.125 2023-09-29 05:16:20,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=260426.66666666666, ans=0.125 2023-09-29 05:16:21,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:16:24,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:16:24,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:24,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:16:25,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=260426.66666666666, ans=0.0 2023-09-29 05:16:25,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.72 vs. limit=15.0 2023-09-29 05:16:26,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:16:27,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:29,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:16:30,631 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.959e+02 2.142e+02 2.407e+02 4.178e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 05:16:33,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 05:16:35,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:38,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:16:39,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:16:39,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 05:16:39,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 05:16:42,844 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 05:16:44,995 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 05:16:45,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:16:46,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:46,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:16:46,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:47,975 INFO [train.py:1039] (1/4) Epoch 8, batch 1900, loss[loss=0.2225, simple_loss=0.3028, pruned_loss=0.07108, over 24462.00 frames. ], tot_loss[loss=0.2197, simple_loss=0.2859, pruned_loss=0.07675, over 4722879.73 frames. ], batch size: 69, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:16:48,091 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 05:16:48,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:16:48,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:49,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:16:51,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:16:52,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:52,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 05:16:54,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:54,301 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 05:16:54,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:16:55,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:00,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:03,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:17:03,816 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 05:17:05,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 05:17:05,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:17:06,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:17:06,881 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 05:17:08,348 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 05:17:11,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 05:17:13,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:17:19,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 05:17:22,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 05:17:30,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=260693.33333333334, ans=0.125 2023-09-29 05:17:31,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 05:17:33,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 05:17:33,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:17:33,458 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 05:17:33,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 05:17:33,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 05:17:33,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 05:17:33,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:17:38,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 05:17:41,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:17:44,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:17:44,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 05:17:46,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:17:50,836 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.64 vs. limit=22.5 2023-09-29 05:17:51,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 05:17:51,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:17:57,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:17:57,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:17:57,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:17:59,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:18:00,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:18:00,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:18:02,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:18:05,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:05,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:08,562 INFO [train.py:1039] (1/4) Epoch 8, batch 1950, loss[loss=0.2131, simple_loss=0.294, pruned_loss=0.06615, over 24447.00 frames. ], tot_loss[loss=0.2209, simple_loss=0.2872, pruned_loss=0.07733, over 4714969.01 frames. ], batch size: 69, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:18:08,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:18:08,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:08,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:18:10,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=260893.33333333334, ans=0.0 2023-09-29 05:18:11,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:13,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:14,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=12.0 2023-09-29 05:18:16,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:18:16,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:16,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:18:19,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 05:18:20,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:18:21,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:23,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:27,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:18:27,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:27,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:28,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:18:30,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:30,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:18:30,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:18:31,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:32,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.68 vs. limit=12.0 2023-09-29 05:18:33,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.44 vs. limit=22.5 2023-09-29 05:18:35,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:38,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:38,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:38,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:18:38,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 05:18:38,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:18:38,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=260960.0, ans=0.04949747468305833 2023-09-29 05:18:39,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:18:39,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:44,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:44,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=261026.66666666666, ans=0.0 2023-09-29 05:18:47,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:51,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:18:55,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:18:55,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:18:55,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 05:18:56,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:00,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:19:02,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:19:02,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:12,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:14,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:15,356 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.025e+02 2.337e+02 2.726e+02 4.544e+02, threshold=4.674e+02, percent-clipped=3.0 2023-09-29 05:19:16,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:17,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=261160.0, ans=0.0 2023-09-29 05:19:19,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:21,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:19:23,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:25,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 05:19:25,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:19:26,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:19:26,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 05:19:30,277 INFO [train.py:1039] (1/4) Epoch 8, batch 2000, loss[loss=0.2103, simple_loss=0.286, pruned_loss=0.06734, over 24393.00 frames. ], tot_loss[loss=0.222, simple_loss=0.2881, pruned_loss=0.07794, over 4727724.95 frames. ], batch size: 69, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:19:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:30,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=261226.66666666666, ans=0.125 2023-09-29 05:19:35,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:35,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:19:37,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:39,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:19:40,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:43,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 05:19:43,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:19:46,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:19:47,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=261293.33333333334, ans=0.2 2023-09-29 05:19:49,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 05:19:49,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:19:49,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:53,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=261293.33333333334, ans=0.09899494936611666 2023-09-29 05:19:54,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:19:56,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 05:19:56,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=261293.33333333334, ans=0.125 2023-09-29 05:19:57,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:01,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 05:20:01,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:20:03,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 05:20:03,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:05,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=261360.0, ans=0.0 2023-09-29 05:20:06,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:08,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:20:08,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:08,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:10,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:12,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 05:20:15,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 05:20:15,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:15,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:20,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:21,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:20:21,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:22,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=261426.66666666666, ans=0.0 2023-09-29 05:20:23,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:20:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:26,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:26,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:27,001 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.44 vs. limit=6.0 2023-09-29 05:20:27,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:29,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:30,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:33,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 05:20:39,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:20:40,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:41,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=261493.33333333334, ans=0.0 2023-09-29 05:20:43,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:20:48,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:48,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=261493.33333333334, ans=0.0 2023-09-29 05:20:49,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:49,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:50,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=261493.33333333334, ans=10.0 2023-09-29 05:20:51,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:20:52,727 INFO [train.py:1039] (1/4) Epoch 8, batch 2050, loss[loss=0.2186, simple_loss=0.2861, pruned_loss=0.07554, over 24658.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.2871, pruned_loss=0.07725, over 4747160.69 frames. ], batch size: 65, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:20:52,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:20:54,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:54,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:57,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:57,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:57,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=261560.0, ans=0.1 2023-09-29 05:20:59,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=261560.0, ans=0.125 2023-09-29 05:21:05,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:21:07,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:21:08,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:21:08,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:21:11,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 05:21:11,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:21:12,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:12,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:21:24,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:24,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:27,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 05:21:29,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:29,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 05:21:30,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:31,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.46 vs. limit=6.0 2023-09-29 05:21:33,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:35,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:37,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:21:37,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:38,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:21:38,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:21:40,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:21:43,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:43,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=261760.0, ans=0.0 2023-09-29 05:21:45,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=261760.0, ans=0.1 2023-09-29 05:21:46,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:21:48,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:21:50,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:50,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=261760.0, ans=0.04949747468305833 2023-09-29 05:21:54,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:21:59,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:22:01,287 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.119e+02 2.372e+02 3.018e+02 5.017e+02, threshold=4.745e+02, percent-clipped=2.0 2023-09-29 05:22:01,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 05:22:06,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:07,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:22:09,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:22:09,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 05:22:14,309 INFO [train.py:1039] (1/4) Epoch 8, batch 2100, loss[loss=0.2103, simple_loss=0.2866, pruned_loss=0.06699, over 24655.00 frames. ], tot_loss[loss=0.2194, simple_loss=0.2852, pruned_loss=0.07678, over 4729893.23 frames. ], batch size: 65, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:22:14,479 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 05:22:14,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:14,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:16,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:18,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:18,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 05:22:18,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 05:22:20,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:22:23,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:22:23,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:22:26,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:28,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:22:28,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 05:22:30,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:22:30,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 05:22:30,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 05:22:33,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:22:33,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:22:33,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 05:22:33,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 05:22:38,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=261960.0, ans=0.125 2023-09-29 05:22:39,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 05:22:39,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:40,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:22:41,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:45,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:22:48,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 05:22:48,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:22:48,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 05:22:51,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 05:22:51,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=262026.66666666666, ans=0.2 2023-09-29 05:22:53,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:53,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 05:22:53,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 05:22:54,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 05:22:56,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:22:57,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:23:01,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:02,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:03,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=262093.33333333334, ans=0.125 2023-09-29 05:23:04,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:06,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:06,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 05:23:06,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:06,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:07,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:07,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 05:23:09,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 05:23:09,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=262093.33333333334, ans=0.2 2023-09-29 05:23:10,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 05:23:12,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=262093.33333333334, ans=0.2 2023-09-29 05:23:14,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:23:17,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:23:17,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 05:23:19,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=262160.0, ans=0.05 2023-09-29 05:23:24,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:28,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=262160.0, ans=0.125 2023-09-29 05:23:29,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:23:29,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:23:29,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:23:29,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:23:31,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:23:32,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:32,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:23:32,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:23:34,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:35,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 05:23:36,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-09-29 05:23:37,096 INFO [train.py:1039] (1/4) Epoch 8, batch 2150, loss[loss=0.1859, simple_loss=0.2601, pruned_loss=0.0559, over 24560.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2835, pruned_loss=0.07568, over 4727453.39 frames. ], batch size: 60, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:23:37,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 05:23:37,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:40,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:40,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:23:40,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:23:40,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:23:40,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=262226.6666666667, ans=0.1 2023-09-29 05:23:47,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:23:50,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:50,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=262226.6666666667, ans=0.125 2023-09-29 05:23:51,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:54,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:23:54,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:23:54,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:23:55,586 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.42 vs. limit=15.0 2023-09-29 05:23:59,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:59,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:23:59,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:24:04,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:04,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 05:24:09,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:09,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:24:11,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:11,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:12,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:13,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:24:13,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:13,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:24:14,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.18 vs. limit=15.0 2023-09-29 05:24:14,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:24:16,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 05:24:17,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:24:17,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=262360.0, ans=0.125 2023-09-29 05:24:19,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:19,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:20,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:24:21,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:24:24,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:24,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:24:26,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:26,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 05:24:26,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:24:31,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:31,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=262426.6666666667, ans=0.125 2023-09-29 05:24:32,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:34,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:34,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:24:35,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:37,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:37,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 05:24:40,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 05:24:40,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:24:41,462 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 05:24:41,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:41,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:24:41,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=262493.3333333333, ans=0.125 2023-09-29 05:24:42,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 05:24:43,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:24:43,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 05:24:43,057 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 05:24:43,058 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 05:24:44,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 05:24:45,900 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.226e+02 2.643e+02 3.151e+02 6.561e+02, threshold=5.285e+02, percent-clipped=6.0 2023-09-29 05:24:46,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:46,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:46,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:24:47,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:47,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:24:47,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:47,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:57,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:24:57,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=262560.0, ans=0.125 2023-09-29 05:24:58,881 INFO [train.py:1039] (1/4) Epoch 8, batch 2200, loss[loss=0.1819, simple_loss=0.2487, pruned_loss=0.0575, over 24307.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2831, pruned_loss=0.07585, over 4717541.55 frames. ], batch size: 56, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:24:59,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 05:25:04,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:25:08,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:10,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:25:11,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:25:14,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:25:16,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:25:16,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 05:25:18,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=262626.6666666667, ans=0.125 2023-09-29 05:25:22,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 05:25:23,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:25:24,743 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-09-29 05:25:28,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 05:25:29,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=262626.6666666667, ans=0.125 2023-09-29 05:25:32,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:33,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:25:33,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:25:37,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:25:37,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 05:25:42,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:25:44,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:44,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 05:25:47,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:25:49,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:25:51,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:25:51,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-09-29 05:25:52,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:55,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 05:25:57,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:25:58,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 05:26:01,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:01,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:26:01,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:04,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:26:04,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:04,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:04,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:06,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:26:07,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:26:08,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:26:08,760 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:26:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:26:14,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:16,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:26:17,364 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 05:26:20,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:26:20,942 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 05:26:22,241 INFO [train.py:1039] (1/4) Epoch 8, batch 2250, loss[loss=0.2276, simple_loss=0.2834, pruned_loss=0.08589, over 23784.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2835, pruned_loss=0.07567, over 4720496.30 frames. ], batch size: 179, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:26:22,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:26:23,982 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 05:26:25,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:25,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:26:27,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:28,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 05:26:30,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:26:32,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:38,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:26:38,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:26:40,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:42,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:42,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:45,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 05:26:46,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:47,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:26:49,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 05:26:51,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:51,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:51,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=262960.0, ans=0.125 2023-09-29 05:26:52,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:57,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:58,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:26:58,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:27:00,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 05:27:01,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:27:03,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:27:06,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:08,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:09,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:09,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:27:11,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:27:14,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:27:18,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:27:19,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.92 vs. limit=15.0 2023-09-29 05:27:22,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:27:22,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=263093.3333333333, ans=0.125 2023-09-29 05:27:27,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:27:27,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:27:28,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:27:29,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=263160.0, ans=0.125 2023-09-29 05:27:31,593 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.954e+02 2.186e+02 2.448e+02 4.409e+02, threshold=4.373e+02, percent-clipped=0.0 2023-09-29 05:27:32,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.71 vs. limit=22.5 2023-09-29 05:27:36,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:27:39,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:27:39,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 05:27:39,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:39,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:27:43,928 INFO [train.py:1039] (1/4) Epoch 8, batch 2300, loss[loss=0.1817, simple_loss=0.2509, pruned_loss=0.05627, over 20603.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2847, pruned_loss=0.07624, over 4728350.03 frames. ], batch size: 45, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:27:43,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 05:27:44,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.89 vs. limit=15.0 2023-09-29 05:27:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:27:45,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:27:54,153 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 05:27:55,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.39 vs. limit=6.0 2023-09-29 05:27:55,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:03,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:28:03,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:28:04,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:04,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:04,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 05:28:06,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:28:07,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:07,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:28:11,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:28:14,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:28:17,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:22,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:28:22,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:25,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:28:29,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:28:31,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=263360.0, ans=0.0 2023-09-29 05:28:32,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:34,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:28:34,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:28:34,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 05:28:37,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:28:37,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:37,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:28:39,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:39,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=263426.6666666667, ans=0.1 2023-09-29 05:28:40,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:28:40,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:28:42,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 05:28:42,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:28:42,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:42,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 05:28:48,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=263493.3333333333, ans=0.1 2023-09-29 05:28:51,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:28:51,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=263493.3333333333, ans=0.1 2023-09-29 05:28:55,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:28:58,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:58,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:28:58,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:29:01,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:29:01,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:03,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:29:03,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 05:29:07,054 INFO [train.py:1039] (1/4) Epoch 8, batch 2350, loss[loss=0.216, simple_loss=0.2908, pruned_loss=0.07067, over 24462.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.2867, pruned_loss=0.07745, over 4730700.17 frames. ], batch size: 66, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:29:07,586 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:29:10,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:29:10,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 05:29:10,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=263560.0, ans=0.125 2023-09-29 05:29:14,191 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=12.0 2023-09-29 05:29:16,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 05:29:18,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:29:18,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=263560.0, ans=0.2 2023-09-29 05:29:22,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:23,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:25,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 05:29:29,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:29:35,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 05:29:37,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:39,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:29:39,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:43,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:29:44,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=263693.3333333333, ans=0.0 2023-09-29 05:29:46,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 05:29:47,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:29:49,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:50,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:29:50,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:29:55,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:29:58,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 05:29:58,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:30:01,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:30:01,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:30:01,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=263760.0, ans=0.125 2023-09-29 05:30:03,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 05:30:03,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:30:06,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 05:30:08,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:30:11,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 05:30:14,764 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.176e+02 2.529e+02 2.945e+02 4.428e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 05:30:17,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 05:30:17,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:30:17,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 05:30:17,164 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 05:30:19,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 05:30:20,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 05:30:24,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:30:28,201 INFO [train.py:1039] (1/4) Epoch 8, batch 2400, loss[loss=0.2382, simple_loss=0.3076, pruned_loss=0.08443, over 23596.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.286, pruned_loss=0.07714, over 4732512.40 frames. ], batch size: 94, lr: 1.30e-02, grad_scale: 16.0 2023-09-29 05:30:28,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:30:32,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:30:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:30:35,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 05:30:35,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 05:30:43,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:30:43,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:30:46,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 05:30:46,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:30:47,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:49,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 05:30:54,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=263960.0, ans=15.0 2023-09-29 05:30:56,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:58,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 05:31:03,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:31:07,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 05:31:09,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:10,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:16,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 05:31:18,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:31:24,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:28,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:31:31,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:33,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:31:33,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:31:33,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:31:33,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:33,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:31:33,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:31:37,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:31:37,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:31:39,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 05:31:39,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 05:31:40,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:40,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:41,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 05:31:41,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 05:31:41,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 05:31:41,181 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 05:31:44,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 05:31:44,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:45,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:45,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:47,839 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 05:31:49,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:49,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:31:51,828 INFO [train.py:1039] (1/4) Epoch 8, batch 2450, loss[loss=0.2269, simple_loss=0.2906, pruned_loss=0.08156, over 24452.00 frames. ], tot_loss[loss=0.2191, simple_loss=0.2841, pruned_loss=0.07699, over 4712748.80 frames. ], batch size: 58, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:31:52,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=264226.6666666667, ans=0.0 2023-09-29 05:31:55,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:31:55,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:59,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:59,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:01,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 05:32:06,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:06,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:09,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:32:10,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:32:10,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:32:10,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 05:32:13,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:16,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:32:17,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:32:17,872 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=22.5 2023-09-29 05:32:21,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:32:22,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=264360.0, ans=0.0 2023-09-29 05:32:23,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:25,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:25,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:32:27,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 05:32:27,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:32:35,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:37,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:37,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:38,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:32:39,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:39,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:32:40,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 05:32:43,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:43,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:32:48,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:32:48,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:53,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:32:53,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 05:32:54,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:32:54,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:54,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 05:32:57,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:32:58,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:33:01,896 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.058e+02 2.397e+02 2.730e+02 4.175e+02, threshold=4.793e+02, percent-clipped=0.0 2023-09-29 05:33:05,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:33:08,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:08,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:33:11,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 05:33:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:33:13,063 INFO [train.py:1039] (1/4) Epoch 8, batch 2500, loss[loss=0.2413, simple_loss=0.2944, pruned_loss=0.09404, over 23811.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2839, pruned_loss=0.07629, over 4702774.24 frames. ], batch size: 164, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:33:19,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:25,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=264560.0, ans=0.0 2023-09-29 05:33:28,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:33:28,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:33:31,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:31,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 05:33:37,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:33:39,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:33:40,048 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=15.0 2023-09-29 05:33:40,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:33:40,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:33:42,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 05:33:43,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:43,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:45,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 05:33:45,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:45,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 05:33:45,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:33:50,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:51,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:52,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=15.0 2023-09-29 05:33:54,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:33:54,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 05:33:54,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:33:57,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:58,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=264693.3333333333, ans=0.0 2023-09-29 05:33:59,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=264760.0, ans=10.0 2023-09-29 05:34:02,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:05,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:11,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:15,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=264760.0, ans=0.125 2023-09-29 05:34:16,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:34:21,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 05:34:21,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:21,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:34:22,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:34:22,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:34:24,217 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 05:34:24,218 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 05:34:24,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 05:34:27,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:30,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 05:34:30,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 05:34:31,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:33,170 INFO [train.py:1039] (1/4) Epoch 8, batch 2550, loss[loss=0.1972, simple_loss=0.2667, pruned_loss=0.06388, over 24604.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2838, pruned_loss=0.07583, over 4716482.45 frames. ], batch size: 60, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:34:33,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 05:34:36,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 05:34:39,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:41,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:34:42,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:34:45,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:47,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 05:34:47,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:34:51,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 05:34:53,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:34:54,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:57,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:57,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 05:34:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:34:57,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:34:58,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=264960.0, ans=0.0 2023-09-29 05:34:59,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:35:02,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 05:35:02,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:35:02,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:02,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 05:35:16,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:35:20,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:20,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:20,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:35:22,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:35:27,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=265093.3333333333, ans=0.0 2023-09-29 05:35:29,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:32,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:35:32,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:35:32,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:35:33,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:35:33,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:35:35,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:36,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:40,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:35:40,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 05:35:40,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:35:41,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:43,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:35:45,163 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.988e+02 2.217e+02 2.595e+02 3.453e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 05:35:45,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:35:46,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:35:52,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=265160.0, ans=0.2 2023-09-29 05:35:54,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:35:55,492 INFO [train.py:1039] (1/4) Epoch 8, batch 2600, loss[loss=0.2037, simple_loss=0.27, pruned_loss=0.06865, over 24331.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2843, pruned_loss=0.07602, over 4730581.41 frames. ], batch size: 56, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:35:57,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:35:58,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=265226.6666666667, ans=0.2 2023-09-29 05:36:00,693 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 05:36:02,329 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 05:36:02,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:36:03,801 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 05:36:03,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 05:36:03,982 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 05:36:07,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:36:07,033 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 05:36:08,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 05:36:09,994 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 05:36:12,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:36:14,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 05:36:16,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 05:36:17,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:36:17,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 05:36:21,560 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 05:36:21,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 05:36:23,588 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.45 vs. limit=15.0 2023-09-29 05:36:31,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:31,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:31,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:31,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 05:36:33,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:36:37,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=265360.0, ans=0.0 2023-09-29 05:36:39,665 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 05:36:44,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:44,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:45,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 05:36:45,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:36:45,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:47,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 05:36:50,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:36:50,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:36:54,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:55,937 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 05:36:55,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:56,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:37:03,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:37:04,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:37:04,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 05:37:06,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:37:08,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:09,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:15,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 05:37:16,980 INFO [train.py:1039] (1/4) Epoch 8, batch 2650, loss[loss=0.2198, simple_loss=0.3037, pruned_loss=0.06792, over 24632.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2863, pruned_loss=0.07714, over 4721567.30 frames. ], batch size: 73, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:37:17,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:17,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=265560.0, ans=0.125 2023-09-29 05:37:17,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=265560.0, ans=0.0 2023-09-29 05:37:18,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:37:22,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 05:37:23,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:23,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=265560.0, ans=0.125 2023-09-29 05:37:25,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:37:26,494 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 05:37:26,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:37:27,301 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.62 vs. limit=22.5 2023-09-29 05:37:28,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:31,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:37:33,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:35,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:37:36,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 05:37:36,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:37:36,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:37:41,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 05:37:41,717 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 05:37:45,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:48,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 05:37:48,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:37:48,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 05:37:49,450 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.11 vs. limit=12.0 2023-09-29 05:37:51,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:53,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:37:53,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:54,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:37:57,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 05:37:57,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 05:37:58,711 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.47 vs. limit=15.0 2023-09-29 05:38:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:05,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 05:38:05,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:06,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:06,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:06,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:08,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:10,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:10,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:11,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:38:13,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:38:14,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:38:17,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:19,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:38:19,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:21,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:21,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:38:26,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:28,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:38:28,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:28,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=265826.6666666667, ans=0.04949747468305833 2023-09-29 05:38:29,356 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.037e+02 2.259e+02 2.622e+02 3.986e+02, threshold=4.518e+02, percent-clipped=0.0 2023-09-29 05:38:29,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 05:38:33,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:34,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:34,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=265826.6666666667, ans=0.125 2023-09-29 05:38:35,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:35,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:37,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:39,336 INFO [train.py:1039] (1/4) Epoch 8, batch 2700, loss[loss=0.2323, simple_loss=0.3062, pruned_loss=0.07919, over 24011.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2869, pruned_loss=0.07718, over 4729236.51 frames. ], batch size: 86, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:38:39,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:41,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:38:41,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 05:38:45,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:38:46,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 05:38:48,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:48,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:48,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:50,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:38:50,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:51,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:38:51,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:38:51,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 05:38:51,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:38:54,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:54,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:38:56,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:01,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:39:01,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 05:39:03,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:07,929 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=265960.0, ans=0.125 2023-09-29 05:39:09,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:39:09,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:11,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=266026.6666666667, ans=0.0 2023-09-29 05:39:15,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:39:15,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:39:15,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:39:15,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:39:20,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:23,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:39:23,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:39:29,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:29,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:39:36,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:39:38,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:39:41,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:39:41,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:44,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:46,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:46,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:48,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:49,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:50,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:39:52,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:53,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:53,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:56,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 05:39:56,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:57,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=266160.0, ans=0.125 2023-09-29 05:39:59,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:39:59,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 05:40:01,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 05:40:02,918 INFO [train.py:1039] (1/4) Epoch 8, batch 2750, loss[loss=0.2233, simple_loss=0.2857, pruned_loss=0.08043, over 23386.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2864, pruned_loss=0.07733, over 4711900.29 frames. ], batch size: 106, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:40:03,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:04,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:04,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:08,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:08,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:40:08,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:12,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:40:13,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:40:13,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:13,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 05:40:13,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:40:13,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:22,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 05:40:24,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:40:25,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:25,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:40:25,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:40:27,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:28,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:40:29,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:30,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:32,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=266293.3333333333, ans=0.0 2023-09-29 05:40:32,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=266293.3333333333, ans=0.0 2023-09-29 05:40:33,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:40:35,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:40:36,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:40:36,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:38,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:40:41,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=266360.0, ans=0.125 2023-09-29 05:40:45,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:48,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:40:49,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:51,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:51,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:40:53,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:40:59,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:41:00,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:41:00,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 05:41:02,948 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.29 vs. limit=15.0 2023-09-29 05:41:03,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=266426.6666666667, ans=0.09899494936611666 2023-09-29 05:41:04,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:05,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=266426.6666666667, ans=0.125 2023-09-29 05:41:06,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 05:41:11,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:41:13,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:41:14,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 05:41:14,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:41:15,922 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 2.077e+02 2.428e+02 2.871e+02 4.392e+02, threshold=4.857e+02, percent-clipped=0.0 2023-09-29 05:41:17,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:41:17,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 05:41:17,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:41:21,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:41:21,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:21,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:41:23,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 05:41:23,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:23,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:25,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:25,243 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 05:41:25,244 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 05:41:26,528 INFO [train.py:1039] (1/4) Epoch 8, batch 2800, loss[loss=0.2077, simple_loss=0.245, pruned_loss=0.08514, over 19071.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2845, pruned_loss=0.07641, over 4712101.66 frames. ], batch size: 388, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:41:31,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:31,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=266560.0, ans=0.125 2023-09-29 05:41:32,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=266560.0, ans=0.2 2023-09-29 05:41:33,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:41:33,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:41:37,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:41:39,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 05:41:41,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=266626.6666666667, ans=0.1 2023-09-29 05:41:42,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:41:42,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=266626.6666666667, ans=0.125 2023-09-29 05:41:43,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.16 vs. limit=15.0 2023-09-29 05:41:43,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 05:41:45,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:46,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:41:46,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:41:50,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:41:50,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:50,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:41:55,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:41:55,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=266626.6666666667, ans=0.125 2023-09-29 05:42:04,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:42:04,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=266693.3333333333, ans=0.0 2023-09-29 05:42:07,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:09,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:11,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:42:11,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:17,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:17,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 05:42:17,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:19,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:19,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:42:23,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:24,574 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.06 vs. limit=15.0 2023-09-29 05:42:25,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:27,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:30,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:42:30,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:30,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:42:32,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:42:33,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:42:34,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:42:34,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 05:42:34,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:36,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:36,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:38,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 05:42:38,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:38,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:42:39,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:42:41,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 05:42:48,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:48,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:42:48,915 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.05 vs. limit=12.0 2023-09-29 05:42:49,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:42:51,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:42:52,727 INFO [train.py:1039] (1/4) Epoch 8, batch 2850, loss[loss=0.2053, simple_loss=0.2832, pruned_loss=0.06374, over 24442.00 frames. ], tot_loss[loss=0.218, simple_loss=0.2839, pruned_loss=0.07606, over 4721283.98 frames. ], batch size: 69, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:42:57,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:42:57,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:57,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:58,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:59,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=266893.3333333333, ans=0.125 2023-09-29 05:42:59,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=266893.3333333333, ans=0.125 2023-09-29 05:43:00,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:43:02,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:43:04,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 05:43:04,514 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:43:11,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 05:43:11,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:13,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 05:43:13,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:13,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=266960.0, ans=0.125 2023-09-29 05:43:16,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 05:43:16,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 05:43:20,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:30,414 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.29 vs. limit=15.0 2023-09-29 05:43:32,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:33,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:35,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:43:35,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:43:36,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:43:37,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:43:40,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:43:40,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 05:43:42,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:43:42,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:43:44,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:44,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:46,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:46,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:48,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:48,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=267093.3333333333, ans=0.125 2023-09-29 05:43:49,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:49,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=267093.3333333333, ans=0.125 2023-09-29 05:43:53,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:43:53,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:54,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:57,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:44:02,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:44:02,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=267160.0, ans=0.0 2023-09-29 05:44:03,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.10 vs. limit=12.0 2023-09-29 05:44:04,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 05:44:04,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 05:44:05,428 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.097e+02 2.299e+02 2.689e+02 7.485e+02, threshold=4.599e+02, percent-clipped=2.0 2023-09-29 05:44:07,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:44:07,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:07,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 05:44:08,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:44:10,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:10,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:10,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:44:10,252 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 05:44:11,611 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 05:44:11,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:13,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:15,347 INFO [train.py:1039] (1/4) Epoch 8, batch 2900, loss[loss=0.1829, simple_loss=0.2447, pruned_loss=0.0606, over 19815.00 frames. ], tot_loss[loss=0.2172, simple_loss=0.2836, pruned_loss=0.07544, over 4709831.38 frames. ], batch size: 43, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:44:15,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:17,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:17,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:44:19,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 05:44:23,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:23,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 05:44:24,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 05:44:25,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:44:25,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:44:27,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:29,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:44:33,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:33,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:36,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=267293.3333333333, ans=0.04949747468305833 2023-09-29 05:44:37,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:44:38,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 05:44:39,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:44:41,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:41,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=267293.3333333333, ans=0.125 2023-09-29 05:44:42,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 05:44:44,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 05:44:45,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:45,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 05:44:45,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:44:49,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:44:49,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:52,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:56,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:59,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:45:03,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:04,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 05:45:04,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 05:45:04,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:45:09,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:45:12,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 05:45:13,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:45:17,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=267426.6666666667, ans=0.1 2023-09-29 05:45:20,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:45:21,045 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.31 vs. limit=15.0 2023-09-29 05:45:28,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:45:28,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:45:29,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=267493.3333333333, ans=10.0 2023-09-29 05:45:29,313 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.23 vs. limit=12.0 2023-09-29 05:45:30,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 05:45:33,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:33,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 05:45:34,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:34,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:45:36,144 INFO [train.py:1039] (1/4) Epoch 8, batch 2950, loss[loss=0.2468, simple_loss=0.2952, pruned_loss=0.09919, over 23352.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2844, pruned_loss=0.07588, over 4713826.81 frames. ], batch size: 285, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:45:41,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:43,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 05:45:43,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:44,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:46,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:45:48,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:45:49,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 05:45:49,793 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:45:51,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 05:45:51,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:45:51,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:56,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:45:59,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:01,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:01,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:04,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:05,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:46:06,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:46:09,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 05:46:16,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 05:46:18,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 05:46:19,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:46:21,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 05:46:21,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=267693.3333333333, ans=0.125 2023-09-29 05:46:22,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 05:46:22,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:24,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:46:24,259 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 05:46:24,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:46:27,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 05:46:28,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:30,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:46:33,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:35,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:46:35,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:35,139 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 05:46:36,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:36,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 05:46:41,139 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.33 vs. limit=15.0 2023-09-29 05:46:43,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:45,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:46:45,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 05:46:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:46:47,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 05:46:50,570 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.933e+02 2.152e+02 2.474e+02 4.181e+02, threshold=4.303e+02, percent-clipped=1.0 2023-09-29 05:46:50,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:50,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=267826.6666666667, ans=0.0 2023-09-29 05:46:52,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:52,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:46:55,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:55,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:46:55,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:46:57,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:57,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:46:57,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:46:57,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:58,567 INFO [train.py:1039] (1/4) Epoch 8, batch 3000, loss[loss=0.2198, simple_loss=0.2799, pruned_loss=0.07991, over 23786.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2842, pruned_loss=0.07549, over 4717910.39 frames. ], batch size: 135, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:46:58,567 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 05:47:12,763 INFO [train.py:1071] (1/4) Epoch 8, validation: loss=0.3012, simple_loss=0.2865, pruned_loss=0.1579, over 1125622.00 frames. 2023-09-29 05:47:12,764 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 05:47:14,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:47:15,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:15,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 05:47:16,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:18,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:47:20,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:47:23,425 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 05:47:23,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 05:47:25,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:47:26,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:47:26,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 05:47:26,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:26,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=267893.3333333333, ans=0.125 2023-09-29 05:47:32,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:47:33,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=267960.0, ans=0.125 2023-09-29 05:47:45,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.46 vs. limit=15.0 2023-09-29 05:47:45,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:47:46,564 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.97 vs. limit=15.0 2023-09-29 05:47:51,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 05:47:53,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:47:56,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:47:57,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:57,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:47:59,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:47:59,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 05:48:01,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 05:48:04,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:48:04,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:48:05,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:48:05,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:07,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:07,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:48:11,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:48:12,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:48:12,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:48:12,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=268093.3333333333, ans=0.125 2023-09-29 05:48:13,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:16,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 05:48:16,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:48:16,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:19,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:48:22,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:22,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:25,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:48:25,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 05:48:25,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:48:26,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 05:48:26,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:48:28,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=268160.0, ans=0.125 2023-09-29 05:48:31,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 05:48:34,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:48:34,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:48:34,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 05:48:35,801 INFO [train.py:1039] (1/4) Epoch 8, batch 3050, loss[loss=0.225, simple_loss=0.3021, pruned_loss=0.07393, over 24380.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2864, pruned_loss=0.07692, over 4715893.35 frames. ], batch size: 77, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:48:37,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 05:48:37,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:48:37,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:48:38,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:38,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:48:38,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:39,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:48:42,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 05:48:44,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:48:47,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:48:47,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:48:51,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:55,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 05:48:56,022 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.37 vs. limit=15.0 2023-09-29 05:49:01,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=268293.3333333333, ans=0.0 2023-09-29 05:49:03,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 05:49:03,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 05:49:03,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:06,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:49:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:10,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:10,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:11,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:13,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:49:13,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:14,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:14,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:14,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:17,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:20,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:20,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 05:49:22,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:22,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:49:27,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:49:27,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:49:27,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:49:29,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:31,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.09 vs. limit=22.5 2023-09-29 05:49:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:36,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:42,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:42,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:49:42,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:44,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:44,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:49:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:45,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 05:49:46,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=268493.3333333333, ans=0.125 2023-09-29 05:49:47,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:47,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=268493.3333333333, ans=0.125 2023-09-29 05:49:48,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:50,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.057e+02 2.311e+02 2.662e+02 5.288e+02, threshold=4.621e+02, percent-clipped=1.0 2023-09-29 05:49:50,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 05:49:51,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:52,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=268493.3333333333, ans=0.125 2023-09-29 05:49:55,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=268560.0, ans=22.5 2023-09-29 05:49:56,334 INFO [train.py:1039] (1/4) Epoch 8, batch 3100, loss[loss=0.1995, simple_loss=0.2678, pruned_loss=0.06562, over 24476.00 frames. ], tot_loss[loss=0.2209, simple_loss=0.2865, pruned_loss=0.07763, over 4711222.37 frames. ], batch size: 58, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:49:57,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:58,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:49:59,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:50:02,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 05:50:06,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 05:50:06,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 05:50:08,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:50:09,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=268560.0, ans=0.125 2023-09-29 05:50:13,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:50:13,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:15,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:50:18,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:21,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=268626.6666666667, ans=0.125 2023-09-29 05:50:24,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 05:50:30,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:50:30,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:31,157 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.10 vs. limit=12.0 2023-09-29 05:50:31,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:31,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:50:33,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:50:35,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:50:35,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 05:50:35,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:50:37,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:40,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 05:50:40,798 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.49 vs. limit=6.0 2023-09-29 05:50:41,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:50:44,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:50:44,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 05:50:44,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=268760.0, ans=0.1 2023-09-29 05:50:46,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 05:50:47,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:47,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:49,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:50:49,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:49,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:50:51,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:50:51,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:50:54,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:50:54,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:50:54,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:54,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 05:50:57,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:59,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 05:51:00,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:51:02,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 05:51:02,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:02,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:03,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 05:51:06,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=268826.6666666667, ans=0.0 2023-09-29 05:51:06,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.99 vs. limit=15.0 2023-09-29 05:51:16,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 05:51:19,611 INFO [train.py:1039] (1/4) Epoch 8, batch 3150, loss[loss=0.1894, simple_loss=0.2579, pruned_loss=0.06046, over 24601.00 frames. ], tot_loss[loss=0.219, simple_loss=0.285, pruned_loss=0.07643, over 4719468.98 frames. ], batch size: 60, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:51:21,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:21,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:24,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:51:24,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:51:25,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 05:51:27,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:27,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:51:28,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 05:51:30,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:32,102 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 05:51:35,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 05:51:35,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:51:36,688 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 05:51:38,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:51:38,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 05:51:40,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 05:51:40,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 05:51:40,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:40,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:51:41,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:43,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 05:51:47,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:51,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:51:54,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 05:51:54,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:51:55,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:51:57,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:57,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 05:51:59,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 05:52:00,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:52:02,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:52:02,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:52:02,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:02,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:52:03,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:52:03,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:52:05,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 05:52:05,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:52:05,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:07,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:52:07,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:52:09,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 05:52:09,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:11,350 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.92 vs. limit=15.0 2023-09-29 05:52:12,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 05:52:12,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:13,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 05:52:15,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 05:52:15,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:52:17,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:17,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 05:52:19,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:52:21,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:52:25,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:26,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:52:31,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:52:31,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:34,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:52:35,510 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.026e+02 2.271e+02 2.794e+02 4.211e+02, threshold=4.543e+02, percent-clipped=0.0 2023-09-29 05:52:38,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:52:38,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:52:42,180 INFO [train.py:1039] (1/4) Epoch 8, batch 3200, loss[loss=0.2278, simple_loss=0.2876, pruned_loss=0.08396, over 23335.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2845, pruned_loss=0.07548, over 4733709.32 frames. ], batch size: 119, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:52:43,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:46,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:52:46,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 05:52:50,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:50,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=269226.6666666667, ans=0.2 2023-09-29 05:52:57,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:53:02,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:53:11,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:53:20,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=269360.0, ans=0.125 2023-09-29 05:53:22,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 05:53:24,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:53:28,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 05:53:28,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=269360.0, ans=0.5 2023-09-29 05:53:29,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:53:31,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:53:31,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:53:32,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:53:38,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 05:53:39,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:53:43,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 05:53:46,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 05:53:47,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:53:52,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:53,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:53:53,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:53,995 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 05:53:53,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:53:59,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:00,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 05:54:00,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 05:54:02,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 05:54:03,747 INFO [train.py:1039] (1/4) Epoch 8, batch 3250, loss[loss=0.2481, simple_loss=0.3006, pruned_loss=0.09786, over 22785.00 frames. ], tot_loss[loss=0.218, simple_loss=0.2849, pruned_loss=0.07552, over 4736720.06 frames. ], batch size: 322, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:54:03,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 05:54:05,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:54:09,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:54:11,030 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 05:54:11,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:11,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:12,742 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 05:54:17,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:54:19,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:26,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:54:26,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 05:54:28,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:28,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:54:28,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:29,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:30,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:54:34,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:54:34,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:34,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:35,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:54:37,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=269693.3333333333, ans=0.2 2023-09-29 05:54:38,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:39,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:42,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:42,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:45,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:45,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:45,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:54:49,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 05:54:49,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:49,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:54:52,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:52,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:55:00,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:55:08,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:55:09,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:09,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 05:55:09,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:55:09,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:55:09,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:13,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 05:55:13,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 05:55:13,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:55:14,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:16,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:17,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:55:18,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:18,727 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:55:19,806 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.998e+02 2.232e+02 2.545e+02 3.931e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 05:55:21,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:23,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:24,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 05:55:24,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:25,841 INFO [train.py:1039] (1/4) Epoch 8, batch 3300, loss[loss=0.2432, simple_loss=0.2997, pruned_loss=0.09331, over 23510.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.286, pruned_loss=0.07621, over 4737739.17 frames. ], batch size: 134, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:55:26,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=269893.3333333333, ans=0.1 2023-09-29 05:55:27,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:55:27,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 05:55:29,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=269893.3333333333, ans=0.125 2023-09-29 05:55:30,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:55:30,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 05:55:32,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 05:55:33,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 05:55:33,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:35,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=269893.3333333333, ans=0.125 2023-09-29 05:55:38,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:40,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:55:40,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:43,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:55:43,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:55:45,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:46,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:48,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=269960.0, ans=0.2 2023-09-29 05:55:50,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 05:55:50,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:55:52,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:52,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:53,874 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 05:55:54,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=269960.0, ans=0.0 2023-09-29 05:55:55,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:55:55,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:55:57,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:55:57,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:55:57,076 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 05:56:01,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:01,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:56:04,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:04,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 05:56:06,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 05:56:06,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:07,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:56:09,592 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 05:56:10,455 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.39 vs. limit=15.0 2023-09-29 05:56:11,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 05:56:11,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:15,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 05:56:18,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:19,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:56:21,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:23,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:23,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:23,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:23,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:56:26,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:56:26,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:26,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:56:28,454 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 05:56:29,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 05:56:31,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:56:32,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:56:32,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:34,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:34,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:36,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:56:37,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:37,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:56:37,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:39,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:56:43,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 05:56:44,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:45,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:46,934 INFO [train.py:1039] (1/4) Epoch 8, batch 3350, loss[loss=0.2178, simple_loss=0.2862, pruned_loss=0.07467, over 23607.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.2873, pruned_loss=0.07713, over 4719441.83 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:56:47,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:56:47,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:49,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:52,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:52,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:55,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:57,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:58,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:59,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:02,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:57:03,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=270293.3333333333, ans=0.05 2023-09-29 05:57:04,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:04,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:57:05,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 05:57:08,669 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 05:57:08,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:11,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 05:57:11,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 05:57:14,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:57:14,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:57:16,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:16,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 05:57:16,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:16,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:57:18,704 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:21,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:21,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:23,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:57:25,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:28,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:28,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:34,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:57:36,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:37,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:37,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:39,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:43,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 05:57:43,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:57:43,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 05:57:43,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:57:46,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 05:57:46,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:46,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:53,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=270493.3333333333, ans=0.0 2023-09-29 05:57:54,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:56,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 05:57:56,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:57:58,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:57:59,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:58:00,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=270493.3333333333, ans=0.0 2023-09-29 05:58:02,777 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 2.038e+02 2.381e+02 2.842e+02 4.419e+02, threshold=4.763e+02, percent-clipped=0.0 2023-09-29 05:58:03,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=270493.3333333333, ans=0.0 2023-09-29 05:58:04,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:06,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 05:58:06,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:58:08,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:58:09,757 INFO [train.py:1039] (1/4) Epoch 8, batch 3400, loss[loss=0.2378, simple_loss=0.2901, pruned_loss=0.09282, over 23352.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2869, pruned_loss=0.0767, over 4719560.32 frames. ], batch size: 285, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:58:09,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:09,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 05:58:11,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:58:11,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 05:58:13,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:58:15,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=270560.0, ans=0.125 2023-09-29 05:58:15,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=270560.0, ans=0.125 2023-09-29 05:58:16,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:58:16,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 05:58:21,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 05:58:22,542 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 05:58:22,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:58:27,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:27,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:58:29,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:29,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:58:34,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=270626.6666666667, ans=0.125 2023-09-29 05:58:37,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:58:39,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 05:58:44,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:58:46,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:46,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:48,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:58:53,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.96 vs. limit=15.0 2023-09-29 05:58:54,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:58:56,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 05:59:01,578 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.03 vs. limit=15.0 2023-09-29 05:59:03,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 05:59:07,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:07,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:09,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:59:09,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:59:11,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:59:14,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:59:14,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:59:16,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=270826.6666666667, ans=0.1 2023-09-29 05:59:20,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:20,639 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:59:21,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 05:59:26,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:59:31,896 INFO [train.py:1039] (1/4) Epoch 8, batch 3450, loss[loss=0.2188, simple_loss=0.2671, pruned_loss=0.08523, over 23419.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2869, pruned_loss=0.07681, over 4725451.82 frames. ], batch size: 285, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:59:32,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 05:59:38,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 05:59:38,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:40,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:59:40,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 05:59:42,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:45,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:59:50,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:59:52,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:59:53,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:59:53,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:55,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:55,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=270960.0, ans=0.125 2023-09-29 05:59:59,728 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.71 vs. limit=15.0 2023-09-29 06:00:02,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 06:00:06,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 06:00:06,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:00:08,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:00:08,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:10,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=271026.6666666667, ans=0.0 2023-09-29 06:00:15,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 06:00:15,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:00:20,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:20,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:00:23,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:00:25,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:00:26,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 06:00:26,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:28,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:00:31,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:00:33,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 06:00:36,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:00:41,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:00:42,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:46,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:47,987 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.293e+02 2.662e+02 4.290e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 06:00:51,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:51,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:00:52,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:54,257 INFO [train.py:1039] (1/4) Epoch 8, batch 3500, loss[loss=0.2249, simple_loss=0.2694, pruned_loss=0.09022, over 23746.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2851, pruned_loss=0.07564, over 4728801.92 frames. ], batch size: 232, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:00:56,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:59,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:01:01,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 06:01:03,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:01:06,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:01:07,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=271226.6666666667, ans=0.125 2023-09-29 06:01:09,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:01:09,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 06:01:14,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:01:15,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:01:15,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:01:17,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:17,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:01:17,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:19,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:19,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 06:01:22,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:22,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:01:24,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:28,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:28,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 06:01:29,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:32,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:34,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:01:34,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=271360.0, ans=0.125 2023-09-29 06:01:36,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:39,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:01:39,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:40,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 06:01:42,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 06:01:42,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=271426.6666666667, ans=0.125 2023-09-29 06:01:43,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 06:01:43,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:46,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:46,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:46,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:01:49,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:01:50,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=271426.6666666667, ans=0.2 2023-09-29 06:01:51,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:01:54,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=271426.6666666667, ans=0.2 2023-09-29 06:01:57,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:01:58,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 06:01:58,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 06:01:58,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:03,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:05,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:06,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:08,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 06:02:08,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:10,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:02:11,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 06:02:13,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 06:02:16,402 INFO [train.py:1039] (1/4) Epoch 8, batch 3550, loss[loss=0.2003, simple_loss=0.2621, pruned_loss=0.06927, over 23761.00 frames. ], tot_loss[loss=0.2168, simple_loss=0.2833, pruned_loss=0.07513, over 4727796.14 frames. ], batch size: 135, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:02:16,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:18,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:19,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:19,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:22,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:02:26,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=271560.0, ans=0.125 2023-09-29 06:02:31,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:32,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:02:36,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:02:37,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:02:39,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:39,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=271626.6666666667, ans=0.0 2023-09-29 06:02:40,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:02:41,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:02:44,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:02:45,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:47,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:02:47,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:02:48,222 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:02:52,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:02:52,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:54,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:02:54,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:55,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:02:55,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 06:02:55,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:57,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:58,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 06:03:05,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:07,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:03:08,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:09,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 06:03:10,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:03:12,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 06:03:12,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:03:15,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:03:15,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:03:19,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 06:03:20,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:24,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:25,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 06:03:25,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:30,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:03:31,726 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.264e+02 2.526e+02 3.446e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 06:03:31,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 06:03:39,316 INFO [train.py:1039] (1/4) Epoch 8, batch 3600, loss[loss=0.2134, simple_loss=0.289, pruned_loss=0.06885, over 24617.00 frames. ], tot_loss[loss=0.2155, simple_loss=0.2821, pruned_loss=0.07445, over 4708623.06 frames. ], batch size: 68, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:03:40,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 06:03:40,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:03:41,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:03:44,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:44,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:46,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:03:48,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=271893.3333333333, ans=0.0 2023-09-29 06:03:50,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:03:52,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:54,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:03:54,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:03:55,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:55,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 06:04:00,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:04:02,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:02,771 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:04:04,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:06,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:07,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:04:07,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:04:07,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 06:04:09,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:11,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:13,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:04:16,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:17,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:19,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:21,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 06:04:24,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=272026.6666666667, ans=0.0 2023-09-29 06:04:26,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=272026.6666666667, ans=0.125 2023-09-29 06:04:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:27,969 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:04:29,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:04:29,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 06:04:34,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:04:39,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=272093.3333333333, ans=0.09899494936611666 2023-09-29 06:04:41,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:42,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:47,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:04:48,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:04:49,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 06:04:49,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 06:04:51,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.09 vs. limit=10.0 2023-09-29 06:04:52,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 06:04:54,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:54,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:04:55,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 06:04:57,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:04:57,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:04:58,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:59,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 06:05:00,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 06:05:01,991 INFO [train.py:1039] (1/4) Epoch 8, batch 3650, loss[loss=0.2112, simple_loss=0.2924, pruned_loss=0.06505, over 24484.00 frames. ], tot_loss[loss=0.216, simple_loss=0.2833, pruned_loss=0.07437, over 4720962.55 frames. ], batch size: 66, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:05:03,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:05:03,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 06:05:04,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=272226.6666666667, ans=0.0 2023-09-29 06:05:07,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 06:05:07,966 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:05:09,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:05:12,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 06:05:14,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 06:05:19,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:19,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:05:19,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:05:24,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 06:05:25,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:05:26,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 06:05:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:05:28,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 06:05:31,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:05:32,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:05:32,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:35,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:05:36,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 06:05:38,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 06:05:39,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.55 vs. limit=15.0 2023-09-29 06:05:39,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:05:41,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 06:05:43,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:05:43,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:05:48,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:05:50,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:50,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:05:51,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:05:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:05:55,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:05:57,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:58,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:58,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:06:00,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:06:03,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:06:03,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:10,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=272493.3333333333, ans=0.125 2023-09-29 06:06:11,358 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 06:06:11,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=272493.3333333333, ans=0.0 2023-09-29 06:06:14,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:16,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:16,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:06:16,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:17,948 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.110e+02 2.350e+02 2.595e+02 3.564e+02, threshold=4.700e+02, percent-clipped=0.0 2023-09-29 06:06:18,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:06:18,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:21,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 06:06:22,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:23,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:06:25,111 INFO [train.py:1039] (1/4) Epoch 8, batch 3700, loss[loss=0.2992, simple_loss=0.3385, pruned_loss=0.13, over 19507.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2839, pruned_loss=0.07447, over 4713044.97 frames. ], batch size: 389, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:06:26,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:06:28,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:06:28,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=272560.0, ans=0.0 2023-09-29 06:06:30,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:30,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 06:06:30,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:31,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:06:32,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:06:36,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:06:40,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:41,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:41,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:06:41,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:43,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:06:46,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:48,238 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 06:06:57,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:06:57,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:06:58,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:06:58,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 06:06:58,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:02,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:03,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 06:07:05,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:07,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:07:10,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:10,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:07:12,485 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.97 vs. limit=6.0 2023-09-29 06:07:13,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:07:18,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:18,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 06:07:18,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:07:19,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 06:07:24,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:07:25,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:07:29,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:29,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 06:07:32,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:07:32,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:07:32,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:32,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:37,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:39,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 06:07:39,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 06:07:41,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:07:41,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:07:42,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:07:42,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:07:46,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:47,550 INFO [train.py:1039] (1/4) Epoch 8, batch 3750, loss[loss=0.2296, simple_loss=0.2909, pruned_loss=0.08417, over 23697.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2851, pruned_loss=0.07492, over 4706556.48 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:07:47,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:07:49,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:07:51,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=272893.3333333333, ans=0.02 2023-09-29 06:07:52,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 06:07:54,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:07:57,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:07:57,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 06:07:57,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=272893.3333333333, ans=0.1 2023-09-29 06:07:57,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=272893.3333333333, ans=0.04949747468305833 2023-09-29 06:07:58,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:08:00,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:01,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:04,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.48 vs. limit=15.0 2023-09-29 06:08:05,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:07,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:09,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=272960.0, ans=0.125 2023-09-29 06:08:10,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:08:12,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:08:16,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:08:18,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:20,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 06:08:20,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:22,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:22,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:25,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 06:08:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 06:08:31,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:32,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:33,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:37,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:41,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:08:44,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 06:08:44,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=273093.3333333333, ans=0.1 2023-09-29 06:08:45,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:49,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:51,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:08:54,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:08:57,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:08:59,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:09:01,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:09:02,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:09:04,244 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.271e+02 2.610e+02 3.277e+02 5.264e+02, threshold=5.220e+02, percent-clipped=1.0 2023-09-29 06:09:05,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:09:11,161 INFO [train.py:1039] (1/4) Epoch 8, batch 3800, loss[loss=0.2277, simple_loss=0.2841, pruned_loss=0.08563, over 23819.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2851, pruned_loss=0.07512, over 4693176.30 frames. ], batch size: 179, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:09:16,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:09:20,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:22,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:09:22,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 06:09:22,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=273226.6666666667, ans=0.2 2023-09-29 06:09:24,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:27,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:27,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:09:29,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=273293.3333333333, ans=0.1 2023-09-29 06:09:30,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:09:30,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:30,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:09:32,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:32,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:09:32,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:34,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 06:09:39,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:09:39,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:09:41,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:45,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:09:47,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:09:49,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:09:49,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:51,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:52,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:57,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:09:57,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 06:10:00,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:01,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=273426.6666666667, ans=0.125 2023-09-29 06:10:02,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=273426.6666666667, ans=0.0 2023-09-29 06:10:05,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:11,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:10:14,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 06:10:15,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 06:10:15,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:19,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:20,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:20,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 06:10:24,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 06:10:24,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 06:10:26,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:26,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=273493.3333333333, ans=0.1 2023-09-29 06:10:27,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:33,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:10:33,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=273560.0, ans=0.0 2023-09-29 06:10:34,282 INFO [train.py:1039] (1/4) Epoch 8, batch 3850, loss[loss=0.2159, simple_loss=0.271, pruned_loss=0.0804, over 23894.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2834, pruned_loss=0.07435, over 4701891.62 frames. ], batch size: 195, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:10:34,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:10:37,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=273560.0, ans=0.125 2023-09-29 06:10:40,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:10:41,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 06:10:42,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:10:44,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:47,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:10:48,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:50,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:10:53,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 06:11:00,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:01,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:11:05,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:05,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:11:09,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:10,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:11:11,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:11:13,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:14,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:14,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:15,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:11:17,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 06:11:17,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 06:11:18,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:18,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:21,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 06:11:24,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 06:11:26,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:28,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 06:11:30,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:11:35,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:37,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:43,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:43,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 06:11:45,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 06:11:48,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:48,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:50,377 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.044e+02 2.316e+02 2.829e+02 5.158e+02, threshold=4.631e+02, percent-clipped=0.0 2023-09-29 06:11:52,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:11:52,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:11:52,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:11:53,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 06:11:55,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:56,715 INFO [train.py:1039] (1/4) Epoch 8, batch 3900, loss[loss=0.1901, simple_loss=0.2608, pruned_loss=0.05973, over 24317.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2819, pruned_loss=0.07421, over 4687006.90 frames. ], batch size: 56, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:11:56,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 06:11:56,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:56,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:58,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:11:59,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:00,648 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.96 vs. limit=22.5 2023-09-29 06:12:02,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:12:02,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:12:02,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:12:04,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 06:12:05,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:05,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=273893.3333333333, ans=0.1 2023-09-29 06:12:08,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:10,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:10,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:12:11,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:13,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:15,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:16,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:12:18,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 06:12:18,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:20,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 06:12:20,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:21,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 06:12:23,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 06:12:28,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:29,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:30,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:12:31,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:12:34,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:36,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:12:37,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=274026.6666666667, ans=0.0 2023-09-29 06:12:39,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:12:39,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:12:39,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:12:39,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=274026.6666666667, ans=0.125 2023-09-29 06:12:45,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:46,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:12:53,018 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.16 vs. limit=15.0 2023-09-29 06:12:53,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:12:55,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:13:05,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:08,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 06:13:08,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 06:13:08,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=274160.0, ans=0.1 2023-09-29 06:13:10,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 06:13:12,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:13:13,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 06:13:18,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=274226.6666666667, ans=0.0 2023-09-29 06:13:20,395 INFO [train.py:1039] (1/4) Epoch 8, batch 3950, loss[loss=0.2206, simple_loss=0.281, pruned_loss=0.08008, over 23632.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2814, pruned_loss=0.07403, over 4676412.54 frames. ], batch size: 256, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:13:22,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:13:23,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 06:13:23,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:13:26,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:13:29,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:13:31,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=274226.6666666667, ans=0.125 2023-09-29 06:13:36,110 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 06:13:36,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:36,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 06:13:37,553 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 06:13:37,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:40,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:40,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:13:40,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:42,932 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.76 vs. limit=15.0 2023-09-29 06:13:45,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 06:13:47,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:13:49,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:49,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:13:50,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:13:52,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:13:53,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.61 vs. limit=10.0 2023-09-29 06:14:03,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:14:03,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:14:04,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=274360.0, ans=0.035 2023-09-29 06:14:07,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 06:14:12,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 06:14:12,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 06:14:14,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:14:15,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:14:24,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:14:24,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=274493.3333333333, ans=0.125 2023-09-29 06:14:25,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:14:26,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:14:26,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:14:26,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 06:14:26,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=274493.3333333333, ans=0.0 2023-09-29 06:14:26,466 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:14:26,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-09-29 06:14:31,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:14:32,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:14:36,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 06:14:37,531 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.191e+02 2.483e+02 2.950e+02 4.567e+02, threshold=4.966e+02, percent-clipped=0.0 2023-09-29 06:14:42,736 INFO [train.py:1039] (1/4) Epoch 8, batch 4000, loss[loss=0.2193, simple_loss=0.2744, pruned_loss=0.08212, over 23361.00 frames. ], tot_loss[loss=0.2153, simple_loss=0.2822, pruned_loss=0.07424, over 4686730.44 frames. ], batch size: 119, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:14:44,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=274560.0, ans=0.1 2023-09-29 06:14:45,229 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=22.5 2023-09-29 06:14:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:14:55,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:15:02,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:02,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:15:02,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:15:03,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 06:15:03,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:15:03,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 06:15:06,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:15:06,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 06:15:06,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=274626.6666666667, ans=0.125 2023-09-29 06:15:07,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:11,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:15:11,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:15:11,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:15:12,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:12,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:15:14,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:15:15,883 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 06:15:17,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:15:17,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:21,131 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 06:15:22,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:15:22,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:22,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=274693.3333333333, ans=0.0 2023-09-29 06:15:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 06:15:31,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:32,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:15:34,648 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 06:15:35,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=274760.0, ans=0.0 2023-09-29 06:15:36,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:15:36,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 06:15:36,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:15:36,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:37,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:15:39,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:15:39,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=274760.0, ans=0.0 2023-09-29 06:15:41,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:15:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:43,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 06:15:43,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:46,072 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 06:15:52,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:15:54,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:15:57,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:15:57,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:59,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:16:01,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:02,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=274826.6666666667, ans=0.07 2023-09-29 06:16:03,699 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:16:04,755 INFO [train.py:1039] (1/4) Epoch 8, batch 4050, loss[loss=0.1994, simple_loss=0.2819, pruned_loss=0.05846, over 24678.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.2832, pruned_loss=0.07474, over 4687204.39 frames. ], batch size: 68, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:16:08,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:16:11,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:16:11,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 06:16:13,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:16:15,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:17,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:16:18,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:18,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:22,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:26,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:16:26,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:16:30,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:16:30,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:16:32,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:35,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:38,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 06:16:38,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 06:16:39,951 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 06:16:41,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:16:48,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 06:16:50,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:16:54,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=275093.3333333333, ans=0.125 2023-09-29 06:16:55,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:57,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=275093.3333333333, ans=0.125 2023-09-29 06:16:58,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:17:00,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:00,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:17:03,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:17:06,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 06:17:06,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:17:08,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:08,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 06:17:13,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:13,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=275160.0, ans=0.2 2023-09-29 06:17:20,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=275160.0, ans=0.1 2023-09-29 06:17:21,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 06:17:23,487 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.014e+02 2.168e+02 2.447e+02 3.458e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 06:17:23,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:17:23,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:17:27,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 06:17:27,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 06:17:27,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:28,552 INFO [train.py:1039] (1/4) Epoch 8, batch 4100, loss[loss=0.2454, simple_loss=0.3239, pruned_loss=0.08343, over 24650.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2851, pruned_loss=0.07569, over 4690690.38 frames. ], batch size: 68, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:17:28,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:17:30,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:30,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:17:30,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=275226.6666666667, ans=0.1 2023-09-29 06:17:32,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.08 vs. limit=15.0 2023-09-29 06:17:36,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 06:17:38,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 06:17:38,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 06:17:39,877 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.63 vs. limit=10.0 2023-09-29 06:17:40,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 06:17:40,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:40,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:17:42,172 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 06:17:44,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:46,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:17:46,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:46,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=275293.3333333333, ans=0.2 2023-09-29 06:17:47,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:17:51,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:17:53,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:53,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:17:53,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 06:17:57,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:57,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:17:57,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:17:57,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:59,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 06:18:00,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:02,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 06:18:03,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:18:05,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=275360.0, ans=0.0 2023-09-29 06:18:06,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:18:06,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 06:18:08,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:18:08,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:18:08,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:18:11,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 06:18:14,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:18:14,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:18:16,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 06:18:18,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:18:18,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:23,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:27,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:31,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:31,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:18:42,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:18:42,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:42,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=275493.3333333333, ans=0.0 2023-09-29 06:18:45,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:47,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:18:50,471 INFO [train.py:1039] (1/4) Epoch 8, batch 4150, loss[loss=0.2315, simple_loss=0.3006, pruned_loss=0.08122, over 24060.00 frames. ], tot_loss[loss=0.218, simple_loss=0.285, pruned_loss=0.07547, over 4702375.70 frames. ], batch size: 80, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:18:50,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:52,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:18:54,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:18:54,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:18:57,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 06:18:58,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:58,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 06:19:00,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 06:19:00,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 06:19:02,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:19:06,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=275626.6666666667, ans=0.04949747468305833 2023-09-29 06:19:08,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:19:08,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:13,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:14,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:14,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:19:15,564 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=12.0 2023-09-29 06:19:16,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:19:16,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:19:17,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:19:21,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:22,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=275693.3333333333, ans=0.125 2023-09-29 06:19:23,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=275693.3333333333, ans=0.5 2023-09-29 06:19:24,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.85 vs. limit=15.0 2023-09-29 06:19:25,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:26,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=275693.3333333333, ans=0.0 2023-09-29 06:19:28,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 06:19:29,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 06:19:29,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:19:30,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 06:19:30,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:19:30,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:30,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=275693.3333333333, ans=0.125 2023-09-29 06:19:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:34,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:39,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 06:19:40,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=275760.0, ans=0.2 2023-09-29 06:19:42,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:19:43,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:19:45,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 06:19:45,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:47,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 06:19:48,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:19:50,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:50,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=275760.0, ans=0.125 2023-09-29 06:19:50,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=275760.0, ans=0.1 2023-09-29 06:19:51,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:52,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 06:19:52,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:52,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:19:55,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:19:56,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 06:19:56,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:56,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:19:56,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:19:58,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 06:20:00,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:20:00,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:20:01,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:03,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:20:03,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 06:20:03,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:20:09,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:20:11,682 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 2.232e+02 3.048e+02 3.830e+02 6.363e+02, threshold=6.096e+02, percent-clipped=13.0 2023-09-29 06:20:13,770 INFO [train.py:1039] (1/4) Epoch 8, batch 4200, loss[loss=0.2181, simple_loss=0.2495, pruned_loss=0.09333, over 19482.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2836, pruned_loss=0.07561, over 4683326.64 frames. ], batch size: 388, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:20:13,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 06:20:15,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:20:17,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:19,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:20:20,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:20,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:23,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 06:20:23,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=275893.3333333333, ans=0.1 2023-09-29 06:20:26,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 06:20:26,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:29,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:31,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:20:35,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:20:38,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:20:38,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:38,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 06:20:38,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:38,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:40,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:40,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:20:41,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:20:43,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 06:20:43,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:48,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:20:50,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:20:53,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:20:53,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:55,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:20:55,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 06:20:55,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:20:58,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:21:00,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=276026.6666666667, ans=0.0 2023-09-29 06:21:03,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:21:05,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:13,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:21:16,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 06:21:18,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:21:23,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:21:25,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:28,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 06:21:32,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:21:33,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=276160.0, ans=0.0 2023-09-29 06:21:35,632 INFO [train.py:1039] (1/4) Epoch 8, batch 4250, loss[loss=0.235, simple_loss=0.298, pruned_loss=0.08605, over 23305.00 frames. ], tot_loss[loss=0.2159, simple_loss=0.2826, pruned_loss=0.07463, over 4690296.78 frames. ], batch size: 119, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:21:36,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.94 vs. limit=15.0 2023-09-29 06:21:37,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:37,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:21:41,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:45,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:21:47,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 06:21:47,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:21:49,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.01 vs. limit=15.0 2023-09-29 06:21:52,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:54,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=276293.3333333333, ans=0.125 2023-09-29 06:21:55,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:21:59,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:21:59,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:02,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:22:02,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:05,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:07,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:08,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:22:10,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:12,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 06:22:17,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 06:22:17,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:17,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:17,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:18,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:22:20,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:20,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:24,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:22:25,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:22:28,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:30,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:32,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 06:22:32,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:22:33,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 06:22:35,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:22:36,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:22:38,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:38,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:41,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 06:22:43,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:22:43,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:22:45,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=276493.3333333333, ans=0.0 2023-09-29 06:22:46,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:50,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:51,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:22:53,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:53,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:22:55,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:22:56,719 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.004e+02 2.225e+02 2.717e+02 4.251e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 06:22:56,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:22:56,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 06:22:57,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:58,440 INFO [train.py:1039] (1/4) Epoch 8, batch 4300, loss[loss=0.2379, simple_loss=0.3115, pruned_loss=0.08214, over 24643.00 frames. ], tot_loss[loss=0.2156, simple_loss=0.2829, pruned_loss=0.07417, over 4707416.22 frames. ], batch size: 68, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:23:05,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:23:05,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:08,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:23:10,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=276560.0, ans=0.125 2023-09-29 06:23:16,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:23:16,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 06:23:16,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:23:20,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:23:20,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:23:20,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 06:23:25,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:23:27,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:23:30,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 06:23:31,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:23:31,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 06:23:32,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:23:35,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:23:35,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=276693.3333333333, ans=0.125 2023-09-29 06:23:38,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:23:38,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:40,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:23:43,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:45,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:23:45,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 06:23:45,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 06:23:47,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:23:50,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=276760.0, ans=0.125 2023-09-29 06:23:51,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:23:51,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:51,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 06:23:51,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 06:23:52,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 06:23:53,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:23:53,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 06:23:53,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 06:23:57,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:23:59,249 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 06:23:59,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.85 vs. limit=6.0 2023-09-29 06:24:00,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:24:04,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:04,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:24:05,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 06:24:07,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:24:07,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:08,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:08,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:08,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:24:11,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:24:12,697 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.57 vs. limit=6.0 2023-09-29 06:24:15,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:16,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:16,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:19,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=276893.3333333333, ans=0.0 2023-09-29 06:24:20,350 INFO [train.py:1039] (1/4) Epoch 8, batch 4350, loss[loss=0.2105, simple_loss=0.2765, pruned_loss=0.07231, over 23271.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2836, pruned_loss=0.0746, over 4708589.57 frames. ], batch size: 119, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:24:20,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=276893.3333333333, ans=0.04949747468305833 2023-09-29 06:24:22,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 06:24:22,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:24:26,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:31,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:32,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.59 vs. limit=15.0 2023-09-29 06:24:33,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:24:33,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:24:33,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=276893.3333333333, ans=0.125 2023-09-29 06:24:39,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:24:44,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:46,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=276960.0, ans=0.0 2023-09-29 06:24:47,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:24:47,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:49,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:24:52,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:24:53,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:24:57,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 06:24:57,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:59,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:01,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=277026.6666666667, ans=0.0 2023-09-29 06:25:03,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=277026.6666666667, ans=0.0 2023-09-29 06:25:05,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:08,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 06:25:11,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:11,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:25:15,935 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 06:25:17,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:17,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:25:18,964 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 06:25:20,353 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 06:25:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:20,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:21,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:25:23,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:23,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:23,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:25:25,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=277160.0, ans=0.1 2023-09-29 06:25:27,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 06:25:27,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:27,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:28,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:28,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 06:25:30,139 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 06:25:30,149 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 06:25:30,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 06:25:33,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:25:33,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:25:35,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:25:35,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:25:38,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 06:25:39,667 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.760e+02 2.095e+02 2.321e+02 2.736e+02 4.922e+02, threshold=4.641e+02, percent-clipped=1.0 2023-09-29 06:25:41,148 INFO [train.py:1039] (1/4) Epoch 8, batch 4400, loss[loss=0.2292, simple_loss=0.3065, pruned_loss=0.07602, over 24645.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2837, pruned_loss=0.07419, over 4714669.07 frames. ], batch size: 68, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:25:41,237 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 06:25:41,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:46,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:46,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:47,131 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:25:48,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:51,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 06:25:51,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 06:25:51,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 06:25:51,617 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 06:25:53,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:25:53,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:54,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 06:25:56,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:57,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:59,264 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 06:26:00,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:00,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 06:26:02,756 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 06:26:06,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 06:26:06,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 06:26:06,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 06:26:06,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:08,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:10,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 06:26:11,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 06:26:12,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:15,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:26:15,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:18,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:18,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:18,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 06:26:18,761 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 06:26:23,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:24,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=277360.0, ans=0.0 2023-09-29 06:26:29,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:29,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.95 vs. limit=15.0 2023-09-29 06:26:33,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 06:26:37,598 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:26:38,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:26:40,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:26:42,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:26:44,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 06:26:44,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:26:44,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:26:44,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:26:45,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:26:47,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=277493.3333333333, ans=0.04949747468305833 2023-09-29 06:26:51,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 06:26:54,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 06:26:55,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 06:26:56,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:56,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 06:26:56,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:26:59,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:27:00,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 06:27:03,750 INFO [train.py:1039] (1/4) Epoch 8, batch 4450, loss[loss=0.2952, simple_loss=0.3312, pruned_loss=0.1296, over 19647.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2854, pruned_loss=0.07558, over 4705648.54 frames. ], batch size: 388, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:27:03,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:27:06,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:08,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:27:15,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:16,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:27:20,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:22,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:27:25,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=277626.6666666667, ans=0.0 2023-09-29 06:27:27,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:27:27,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:27,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 06:27:27,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:27,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:27,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:27:27,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:27:30,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:27:31,772 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.88 vs. limit=15.0 2023-09-29 06:27:32,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=277626.6666666667, ans=0.1 2023-09-29 06:27:36,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:36,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:38,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:38,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:40,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:27:42,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.76 vs. limit=22.5 2023-09-29 06:27:45,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:27:46,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 06:27:46,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 06:27:46,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:27:51,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:53,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 06:27:56,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:28:00,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:02,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 06:28:02,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:02,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:02,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:28:02,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:28:05,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:08,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:28:08,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 06:28:10,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:28:11,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:13,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:14,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:16,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:28:19,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:28:21,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 06:28:24,957 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.112e+02 2.512e+02 3.151e+02 6.272e+02, threshold=5.024e+02, percent-clipped=2.0 2023-09-29 06:28:25,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:28:26,499 INFO [train.py:1039] (1/4) Epoch 8, batch 4500, loss[loss=0.2103, simple_loss=0.2827, pruned_loss=0.06893, over 23392.00 frames. ], tot_loss[loss=0.218, simple_loss=0.2852, pruned_loss=0.07545, over 4715712.09 frames. ], batch size: 93, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:28:26,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=277893.3333333333, ans=10.0 2023-09-29 06:28:28,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:29,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 06:28:29,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 06:28:32,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:39,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:39,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:41,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:28:41,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:28:41,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:42,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:43,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=277960.0, ans=0.0 2023-09-29 06:28:53,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:55,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:28:57,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:57,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:28:59,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:29:05,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:29:11,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:29:12,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=278026.6666666667, ans=0.0 2023-09-29 06:29:15,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:29:15,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=278093.3333333333, ans=0.125 2023-09-29 06:29:18,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:29:20,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 06:29:20,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:21,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:29:24,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.26 vs. limit=15.0 2023-09-29 06:29:26,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:29:26,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 06:29:26,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:29:26,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:31,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:29:31,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:29:34,054 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=15.0 2023-09-29 06:29:34,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:37,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:29:37,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:29:39,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 06:29:40,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 06:29:40,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 06:29:45,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 06:29:48,372 INFO [train.py:1039] (1/4) Epoch 8, batch 4550, loss[loss=0.2407, simple_loss=0.2921, pruned_loss=0.09462, over 23770.00 frames. ], tot_loss[loss=0.218, simple_loss=0.2847, pruned_loss=0.07561, over 4713862.66 frames. ], batch size: 164, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:29:48,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 06:29:49,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:29:53,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:53,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:53,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=278226.6666666667, ans=0.125 2023-09-29 06:29:56,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:29:58,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=278226.6666666667, ans=0.125 2023-09-29 06:29:58,620 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.81 vs. limit=15.0 2023-09-29 06:30:00,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:30:01,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=278226.6666666667, ans=0.0 2023-09-29 06:30:04,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:30:06,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:06,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:30:06,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:09,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:09,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:30:12,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:16,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 06:30:18,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 06:30:18,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:30:20,226 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=17.44 vs. limit=15.0 2023-09-29 06:30:21,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 06:30:22,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 06:30:24,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:27,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 06:30:29,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:30:32,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:30:35,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 06:30:39,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:40,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=11.42 vs. limit=15.0 2023-09-29 06:30:40,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:40,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:44,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:44,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 06:30:44,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 06:30:45,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:30:45,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 06:30:49,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 06:30:49,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:51,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:51,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:53,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:53,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:30:55,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:30:55,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 06:30:56,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:56,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:30:58,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 06:30:58,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:30:58,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 06:31:01,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:31:01,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:31:01,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=278493.3333333333, ans=0.05 2023-09-29 06:31:04,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:31:04,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:31:04,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:31:05,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:31:09,474 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.931e+02 2.102e+02 2.382e+02 3.783e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 06:31:09,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:31:11,116 INFO [train.py:1039] (1/4) Epoch 8, batch 4600, loss[loss=0.222, simple_loss=0.2964, pruned_loss=0.0738, over 24657.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.2834, pruned_loss=0.07502, over 4711360.88 frames. ], batch size: 73, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:31:11,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:12,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:31:16,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:31:16,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:31:16,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:16,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 06:31:19,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:31:20,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=278560.0, ans=0.0 2023-09-29 06:31:24,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:31:24,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:27,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:32,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=278626.6666666667, ans=0.2 2023-09-29 06:31:34,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 06:31:35,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:40,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:31:42,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:42,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=278693.3333333333, ans=0.125 2023-09-29 06:31:42,919 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:31:48,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 06:31:48,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:31:50,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:31:55,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:55,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:31:58,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:32:01,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 06:32:02,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:32:07,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:09,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:32:11,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:11,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 06:32:12,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:12,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 06:32:13,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:13,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=278760.0, ans=0.125 2023-09-29 06:32:15,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:18,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:18,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:32:18,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:19,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=278826.6666666667, ans=0.1 2023-09-29 06:32:20,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 06:32:20,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 06:32:20,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 06:32:20,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:21,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:21,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:23,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:34,344 INFO [train.py:1039] (1/4) Epoch 8, batch 4650, loss[loss=0.2174, simple_loss=0.2836, pruned_loss=0.07559, over 24547.00 frames. ], tot_loss[loss=0.2158, simple_loss=0.2828, pruned_loss=0.07441, over 4713648.27 frames. ], batch size: 63, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:32:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:32:38,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:40,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:40,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:32:41,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:41,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:43,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:46,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 06:32:50,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:32:53,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 06:32:53,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:53,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 06:32:53,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=278960.0, ans=0.125 2023-09-29 06:32:54,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:32:54,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 06:32:54,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 06:32:54,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:55,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=278960.0, ans=0.125 2023-09-29 06:32:56,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:32:57,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:33:01,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:01,467 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 06:33:04,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:06,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 06:33:07,328 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-09-29 06:33:08,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:08,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:33:11,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 06:33:13,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:33:16,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:33:18,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:24,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:27,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:28,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:28,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:33:31,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 06:33:31,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 06:33:31,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 06:33:31,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 06:33:33,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:40,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:33:40,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:33:40,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 06:33:40,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:43,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:43,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:33:43,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:33:43,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=279160.0, ans=0.125 2023-09-29 06:33:46,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:33:46,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:47,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:52,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:52,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:33:52,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:33:53,556 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 2.046e+02 2.215e+02 2.491e+02 3.733e+02, threshold=4.429e+02, percent-clipped=0.0 2023-09-29 06:33:53,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 06:33:55,149 INFO [train.py:1039] (1/4) Epoch 8, batch 4700, loss[loss=0.2259, simple_loss=0.2954, pruned_loss=0.07818, over 24658.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.2842, pruned_loss=0.07455, over 4724144.32 frames. ], batch size: 65, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:33:55,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:33:57,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 06:34:05,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:07,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:34:07,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:34:09,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:09,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:34:16,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 06:34:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 06:34:19,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:19,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=279293.3333333333, ans=0.1 2023-09-29 06:34:22,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:34:22,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:34:23,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:24,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.19 vs. limit=22.5 2023-09-29 06:34:28,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:34:29,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:34:33,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:43,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 06:34:43,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=279426.6666666667, ans=0.0 2023-09-29 06:34:44,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:34:46,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:51,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 06:34:53,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:34:56,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:34:57,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 06:34:57,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=279426.6666666667, ans=0.0 2023-09-29 06:34:59,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:59,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:00,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.88 vs. limit=15.0 2023-09-29 06:35:02,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:35:02,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:35:02,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 06:35:02,442 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 06:35:03,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=279493.3333333333, ans=0.125 2023-09-29 06:35:04,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=279493.3333333333, ans=0.0 2023-09-29 06:35:05,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:07,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 06:35:08,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:12,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 06:35:15,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:35:15,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:17,384 INFO [train.py:1039] (1/4) Epoch 8, batch 4750, loss[loss=0.2917, simple_loss=0.3329, pruned_loss=0.1252, over 19576.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2855, pruned_loss=0.07535, over 4709184.70 frames. ], batch size: 388, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:35:21,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:21,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:35:21,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=279560.0, ans=0.125 2023-09-29 06:35:24,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 06:35:24,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:35:26,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=279560.0, ans=0.125 2023-09-29 06:35:27,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 06:35:29,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:35:30,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:31,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:38,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 06:35:42,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:35:44,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=279626.6666666667, ans=0.07 2023-09-29 06:35:45,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 06:35:46,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:47,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=279626.6666666667, ans=0.125 2023-09-29 06:35:50,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:51,650 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 06:35:51,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 06:36:00,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 06:36:04,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:06,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:09,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:36:09,698 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 06:36:09,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:12,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:36:15,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:36:16,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 06:36:17,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 06:36:17,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:36:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:36:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:20,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:36:22,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 06:36:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 06:36:25,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:27,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:36:27,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 06:36:27,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:36:27,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=279826.6666666667, ans=0.1 2023-09-29 06:36:29,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:30,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:36:32,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:34,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:36:38,015 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.176e+02 2.410e+02 2.744e+02 3.912e+02, threshold=4.820e+02, percent-clipped=0.0 2023-09-29 06:36:38,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:39,647 INFO [train.py:1039] (1/4) Epoch 8, batch 4800, loss[loss=0.268, simple_loss=0.3182, pruned_loss=0.1089, over 22659.00 frames. ], tot_loss[loss=0.219, simple_loss=0.2859, pruned_loss=0.07608, over 4714308.94 frames. ], batch size: 322, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:36:39,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 06:36:41,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 06:36:42,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 06:36:44,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:36:44,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:45,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 06:36:51,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:53,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:59,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:36:59,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:59,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:01,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 06:37:01,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:37:03,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:37:04,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:37:08,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:10,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:10,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:37:12,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:12,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:37:12,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:12,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:12,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=280026.6666666667, ans=0.125 2023-09-29 06:37:12,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=280026.6666666667, ans=0.0 2023-09-29 06:37:16,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:18,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:37:21,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:37:22,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:24,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 06:37:24,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 06:37:26,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:26,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:37:26,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:37:26,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:26,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:37:29,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:37:29,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:31,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=280093.3333333333, ans=0.1 2023-09-29 06:37:33,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:38,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=280093.3333333333, ans=0.125 2023-09-29 06:37:39,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:43,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 06:37:43,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:45,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:45,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:37:46,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:48,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=280160.0, ans=0.0 2023-09-29 06:37:48,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=280160.0, ans=0.125 2023-09-29 06:37:49,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:50,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:37:50,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:51,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:37:51,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:37:53,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:37:54,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:56,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:56,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:57,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 06:38:00,676 INFO [train.py:1039] (1/4) Epoch 8, batch 4850, loss[loss=0.1985, simple_loss=0.2501, pruned_loss=0.07344, over 23435.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2855, pruned_loss=0.07557, over 4723899.08 frames. ], batch size: 285, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:38:00,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 06:38:00,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:00,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:00,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:00,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:04,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=280226.6666666667, ans=0.2 2023-09-29 06:38:05,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:38:13,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 06:38:16,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:21,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:22,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:38:22,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:26,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:28,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:38:29,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:38:29,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 06:38:31,362 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.06 vs. limit=22.5 2023-09-29 06:38:32,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:35,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:38:37,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:38:37,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:38:37,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 06:38:40,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:40,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 06:38:45,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 06:38:46,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:38:55,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:38:55,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 06:38:57,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:57,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:38:58,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:39:01,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 06:39:01,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:03,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 06:39:03,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:03,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:05,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 06:39:14,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:18,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=280493.3333333333, ans=0.1 2023-09-29 06:39:19,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:39:19,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:23,647 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.212e+02 2.561e+02 3.191e+02 4.940e+02, threshold=5.123e+02, percent-clipped=1.0 2023-09-29 06:39:23,689 INFO [train.py:1039] (1/4) Epoch 8, batch 4900, loss[loss=0.209, simple_loss=0.2739, pruned_loss=0.07209, over 24451.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.284, pruned_loss=0.07469, over 4721824.44 frames. ], batch size: 58, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:39:26,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 06:39:26,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:39:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:32,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:33,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:39:37,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 06:39:41,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 06:39:41,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=280626.6666666667, ans=0.125 2023-09-29 06:39:44,366 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.77 vs. limit=22.5 2023-09-29 06:39:45,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 06:39:46,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 06:39:46,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:39:48,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:48,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:39:48,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:49,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:39:49,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 06:39:52,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 06:39:52,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:39:54,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:39:54,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:39:56,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=280693.3333333333, ans=0.1 2023-09-29 06:40:00,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:40:00,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:01,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:01,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 06:40:03,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:40:03,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=280693.3333333333, ans=0.125 2023-09-29 06:40:04,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:40:04,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 06:40:06,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 06:40:09,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 06:40:09,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=280693.3333333333, ans=0.0 2023-09-29 06:40:11,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:40:12,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:12,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:40:12,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:13,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=280760.0, ans=0.1 2023-09-29 06:40:14,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:40:14,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:40:14,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 06:40:18,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:19,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:40:21,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:40:24,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 06:40:24,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:40:24,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 06:40:24,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 06:40:35,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:36,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:40:38,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 06:40:38,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:38,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:40:41,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:44,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:40:44,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:40:44,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:44,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:40:45,618 INFO [train.py:1039] (1/4) Epoch 8, batch 4950, loss[loss=0.18, simple_loss=0.2498, pruned_loss=0.05514, over 24441.00 frames. ], tot_loss[loss=0.216, simple_loss=0.283, pruned_loss=0.07443, over 4722366.28 frames. ], batch size: 58, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:40:45,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:40:49,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:40:50,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:53,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 06:40:54,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 06:40:54,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:40:55,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 06:40:55,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:55,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:55,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:40:57,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:40:58,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:58,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:41:01,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:41:01,934 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.99 vs. limit=15.0 2023-09-29 06:41:02,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:41:04,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:04,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:41:07,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:41:12,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:14,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:41:17,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:17,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:18,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:41:21,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 06:41:22,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 06:41:24,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:27,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:41:27,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:41:28,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:41:28,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:41:28,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:41:30,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:32,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:41:35,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:41:39,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:39,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:39,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 06:41:39,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=281093.3333333333, ans=0.0 2023-09-29 06:41:40,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:41:41,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:41:45,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:41:47,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:41:47,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:41:47,926 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.90 vs. limit=22.5 2023-09-29 06:41:48,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:48,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:41:50,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:41:51,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:41:51,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:41:51,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:55,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 06:41:59,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:02,200 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.04 vs. limit=15.0 2023-09-29 06:42:05,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 06:42:05,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:42:07,486 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.069e+02 2.336e+02 2.676e+02 4.238e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 06:42:07,528 INFO [train.py:1039] (1/4) Epoch 8, batch 5000, loss[loss=0.193, simple_loss=0.2583, pruned_loss=0.06388, over 24438.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.2818, pruned_loss=0.07383, over 4724863.09 frames. ], batch size: 58, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:42:08,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=281226.6666666667, ans=15.0 2023-09-29 06:42:12,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=281226.6666666667, ans=0.2 2023-09-29 06:42:14,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:42:14,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:14,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=281226.6666666667, ans=0.125 2023-09-29 06:42:15,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 06:42:16,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 06:42:17,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:42:20,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 06:42:20,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:42:20,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:42:20,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=281226.6666666667, ans=0.0 2023-09-29 06:42:23,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 06:42:23,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:25,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:25,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 06:42:25,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:25,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:42:28,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 06:42:29,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 06:42:29,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:42:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 06:42:31,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:42:31,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:31,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:42:31,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 06:42:33,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 06:42:34,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 06:42:34,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:34,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:36,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 06:42:36,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:36,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=281293.3333333333, ans=0.1 2023-09-29 06:42:39,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:39,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=281360.0, ans=0.1 2023-09-29 06:42:40,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:41,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:42:43,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 06:42:44,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:42:48,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:42:48,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=281360.0, ans=0.0 2023-09-29 06:42:51,419 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 06:42:55,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:56,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:56,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:42:59,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 06:42:59,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:43:01,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:01,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:04,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 06:43:04,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:04,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=281426.6666666667, ans=0.125 2023-09-29 06:43:07,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:09,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:10,137 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=15.0 2023-09-29 06:43:13,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 06:43:17,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:26,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:28,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:28,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:43:29,898 INFO [train.py:1039] (1/4) Epoch 8, batch 5050, loss[loss=0.1887, simple_loss=0.2645, pruned_loss=0.05645, over 24304.00 frames. ], tot_loss[loss=0.2153, simple_loss=0.2824, pruned_loss=0.07406, over 4717312.04 frames. ], batch size: 61, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:43:29,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:30,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:43:30,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:43:30,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:32,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=281560.0, ans=0.125 2023-09-29 06:43:32,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=281560.0, ans=0.125 2023-09-29 06:43:33,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:34,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 06:43:36,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:43:37,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:40,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:43:40,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 06:43:41,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:41,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:44,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:43:46,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:43:46,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:43:55,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 06:43:55,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:43:57,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:43:57,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=281626.6666666667, ans=0.125 2023-09-29 06:43:58,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 06:43:58,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:01,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:01,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:03,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:03,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 06:44:03,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 06:44:05,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:05,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=281693.3333333333, ans=0.0 2023-09-29 06:44:08,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:09,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.81 vs. limit=10.0 2023-09-29 06:44:11,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:11,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 06:44:14,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:17,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 06:44:18,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:44:19,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:44:21,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:21,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:44:22,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:44:24,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:44:26,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:26,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:44:26,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:44:26,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 06:44:28,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:44:30,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:35,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:35,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 06:44:35,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:44:37,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:44:37,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=281826.6666666667, ans=0.0 2023-09-29 06:44:38,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:38,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 06:44:41,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:41,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 06:44:41,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:43,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=281826.6666666667, ans=0.125 2023-09-29 06:44:44,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:46,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:46,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 06:44:48,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 06:44:50,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:51,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:44:51,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:52,963 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.260e+02 2.527e+02 2.886e+02 4.203e+02, threshold=5.054e+02, percent-clipped=0.0 2023-09-29 06:44:53,005 INFO [train.py:1039] (1/4) Epoch 8, batch 5100, loss[loss=0.2186, simple_loss=0.2979, pruned_loss=0.06961, over 24623.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2841, pruned_loss=0.07439, over 4716837.21 frames. ], batch size: 68, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:44:53,330 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 06:44:56,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:59,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 06:44:59,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 06:44:59,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:02,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:45:06,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:45:06,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 06:45:06,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 06:45:13,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:45:13,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:45:18,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:21,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 06:45:22,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:24,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:45:24,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:45:26,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 06:45:31,473 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 06:45:32,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:32,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 06:45:33,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 06:45:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:46,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:45:48,904 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.93 vs. limit=12.0 2023-09-29 06:45:49,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 06:45:49,448 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 06:45:49,472 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 06:45:51,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 06:45:52,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:55,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 06:46:00,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 06:46:02,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:46:03,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:46:05,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 06:46:07,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:46:09,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 06:46:13,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:46:13,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:46:13,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:46:15,132 INFO [train.py:1039] (1/4) Epoch 8, batch 5150, loss[loss=0.2015, simple_loss=0.2801, pruned_loss=0.06147, over 24306.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2849, pruned_loss=0.07501, over 4713339.60 frames. ], batch size: 61, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:46:15,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:46:15,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:46:17,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:46:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 06:46:18,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 06:46:18,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 06:46:19,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:46:19,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 06:46:19,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:21,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 06:46:22,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:24,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:24,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=282226.6666666667, ans=0.2 2023-09-29 06:46:28,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:46:28,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 06:46:30,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:30,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:46:32,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:46:32,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:46:32,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:46:33,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:46:33,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:46:33,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 06:46:37,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:46:37,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:46:39,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:46:41,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 06:46:41,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=282293.3333333333, ans=0.0 2023-09-29 06:46:42,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:46:49,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:46:49,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=282360.0, ans=0.125 2023-09-29 06:46:50,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.11 vs. limit=15.0 2023-09-29 06:46:52,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 06:46:57,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:00,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=282360.0, ans=0.125 2023-09-29 06:47:03,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:03,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:06,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:08,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:11,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 06:47:16,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:47:19,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:47:19,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:47:21,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:23,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:24,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 06:47:29,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:29,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:47:31,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:31,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:47:33,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:47:33,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:47:33,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:47:33,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=282493.3333333333, ans=0.125 2023-09-29 06:47:35,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:47:38,011 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.091e+02 2.433e+02 2.751e+02 4.119e+02, threshold=4.867e+02, percent-clipped=0.0 2023-09-29 06:47:38,056 INFO [train.py:1039] (1/4) Epoch 8, batch 5200, loss[loss=0.1939, simple_loss=0.2628, pruned_loss=0.06247, over 24318.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2853, pruned_loss=0.07572, over 4705249.34 frames. ], batch size: 61, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:47:39,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:47:41,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:47:44,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:50,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 06:47:50,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:47:51,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:53,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=282626.6666666667, ans=0.2 2023-09-29 06:47:54,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:56,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:47:56,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:56,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=282626.6666666667, ans=0.0 2023-09-29 06:47:58,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=282626.6666666667, ans=0.125 2023-09-29 06:48:00,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 06:48:00,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=282626.6666666667, ans=0.0 2023-09-29 06:48:01,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:48:03,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:05,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 06:48:08,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:48:09,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:48:10,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=282693.3333333333, ans=0.95 2023-09-29 06:48:11,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 06:48:11,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 06:48:13,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 06:48:14,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:14,824 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 06:48:14,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:48:16,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:16,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:48:17,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 06:48:19,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:48:19,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=282693.3333333333, ans=0.2 2023-09-29 06:48:21,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:25,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 06:48:25,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 06:48:26,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 06:48:29,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 06:48:29,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:48:37,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:48:38,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:39,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 06:48:41,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:41,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 06:48:41,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:41,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:48:44,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:45,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:48:51,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:51,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:48:51,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:55,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:58,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 06:48:58,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:58,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:49:00,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:01,688 INFO [train.py:1039] (1/4) Epoch 8, batch 5250, loss[loss=0.2397, simple_loss=0.2905, pruned_loss=0.09448, over 23740.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2849, pruned_loss=0.07581, over 4704123.96 frames. ], batch size: 150, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:49:01,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:49:03,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:49:03,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=282893.3333333333, ans=0.1 2023-09-29 06:49:05,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:49:08,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:08,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:49:10,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:49:12,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=282893.3333333333, ans=0.025 2023-09-29 06:49:16,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:49:18,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:49:20,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=282960.0, ans=0.0 2023-09-29 06:49:21,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:49:21,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:49:24,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 06:49:24,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:26,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:48,290 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:50:10,365 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.05 vs. limit=22.5 2023-09-29 06:50:16,296 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.079e+02 2.357e+02 2.633e+02 5.213e+02, threshold=4.714e+02, percent-clipped=2.0 2023-09-29 06:50:16,339 INFO [train.py:1039] (1/4) Epoch 8, batch 5300, loss[loss=0.1927, simple_loss=0.2697, pruned_loss=0.05788, over 24459.00 frames. ], tot_loss[loss=0.2168, simple_loss=0.2832, pruned_loss=0.07521, over 4698133.92 frames. ], batch size: 63, lr: 1.25e-02, grad_scale: 32.0 2023-09-29 06:50:19,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=283226.6666666667, ans=0.0 2023-09-29 06:50:24,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=283226.6666666667, ans=0.2 2023-09-29 06:50:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:50:31,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 06:50:31,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 06:50:31,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:32,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:32,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:32,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:32,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:32,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:50:32,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:32,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:50:33,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:50:33,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 06:50:33,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 06:50:33,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 06:50:34,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:50:34,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 06:50:34,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 06:50:34,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:34,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:34,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:35,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:35,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:50:35,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:35,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:35,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:36,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:36,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:36,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:50:36,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:36,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:50:37,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 06:50:37,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:38,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:38,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 06:50:38,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 06:50:38,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:50:38,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:50:38,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 06:50:38,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 06:50:38,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:39,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:50:39,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:39,822 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 06:50:39,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 06:50:39,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:50:40,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:40,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 06:50:40,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 06:50:40,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 06:50:40,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:50,914 INFO [train.py:1039] (1/4) Epoch 9, batch 0, loss[loss=0.2189, simple_loss=0.2806, pruned_loss=0.07859, over 23282.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2806, pruned_loss=0.07859, over 23282.00 frames. ], batch size: 119, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:50:50,914 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 06:51:04,827 INFO [train.py:1071] (1/4) Epoch 9, validation: loss=0.2824, simple_loss=0.2767, pruned_loss=0.144, over 1125622.00 frames. 2023-09-29 06:51:04,828 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 06:51:06,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 06:51:06,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:51:08,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:51:09,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=283306.6666666667, ans=0.125 2023-09-29 06:51:14,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:14,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:51:14,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:16,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 06:51:17,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 06:51:19,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:20,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:26,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:51:26,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:29,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 06:51:29,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=283373.3333333333, ans=0.125 2023-09-29 06:51:30,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:40,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:51:40,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:42,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 06:51:45,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:51:47,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:51:48,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:51:49,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=283440.0, ans=15.0 2023-09-29 06:51:51,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:51:56,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:02,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 06:52:04,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.75 vs. limit=15.0 2023-09-29 06:52:06,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 06:52:06,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:06,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:06,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:52:06,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:52:10,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 06:52:13,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:13,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:17,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:52:20,489 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 06:52:23,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:52:23,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=283573.3333333333, ans=0.05 2023-09-29 06:52:26,929 INFO [train.py:1039] (1/4) Epoch 9, batch 50, loss[loss=0.2272, simple_loss=0.2892, pruned_loss=0.0826, over 23272.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.2842, pruned_loss=0.07459, over 1065853.22 frames. ], batch size: 119, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:52:27,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:28,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:28,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 06:52:30,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:52:30,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:52:31,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:34,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:36,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:40,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=283640.0, ans=0.05 2023-09-29 06:52:41,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 06:52:41,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:48,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:52:51,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 06:52:52,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 06:52:56,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:52:57,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:52:57,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:59,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:01,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:53:01,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:53:01,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:53:06,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=283773.3333333333, ans=0.125 2023-09-29 06:53:09,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=283773.3333333333, ans=0.0 2023-09-29 06:53:09,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=283773.3333333333, ans=0.0 2023-09-29 06:53:10,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:12,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:12,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:53:14,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 06:53:14,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=283840.0, ans=0.125 2023-09-29 06:53:15,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:53:17,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:53:17,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 06:53:17,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:18,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.78 vs. limit=22.5 2023-09-29 06:53:19,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 06:53:27,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:53:27,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:28,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:30,301 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 2.133e+02 2.436e+02 2.893e+02 4.514e+02, threshold=4.872e+02, percent-clipped=0.0 2023-09-29 06:53:30,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:53:30,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:34,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 06:53:34,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 06:53:36,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:36,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:37,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:39,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:39,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 06:53:39,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 06:53:40,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:53:42,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:42,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=283906.6666666667, ans=0.125 2023-09-29 06:53:43,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:53:43,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 06:53:43,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 06:53:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:47,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:48,948 INFO [train.py:1039] (1/4) Epoch 9, batch 100, loss[loss=0.2207, simple_loss=0.2853, pruned_loss=0.07808, over 23642.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2858, pruned_loss=0.07535, over 1879707.27 frames. ], batch size: 149, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:53:49,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:53:49,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:53:52,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:53:56,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:54:02,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:02,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=283973.3333333333, ans=0.025 2023-09-29 06:54:03,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 06:54:03,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:54:05,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=284040.0, ans=0.07 2023-09-29 06:54:06,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:54:06,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:06,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:54:08,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:54:08,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:09,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 06:54:13,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:54:13,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:14,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:14,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:18,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 06:54:20,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:20,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=284106.6666666667, ans=0.95 2023-09-29 06:54:21,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:21,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:54:23,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:54:28,351 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 06:54:28,374 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 06:54:30,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:54:30,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:54:34,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:54:38,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:39,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:44,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:44,215 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 06:54:45,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:54:49,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:54:51,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:54:51,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:55,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:54:58,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:00,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:55:03,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:04,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:04,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:04,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:55:04,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:05,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.26 vs. limit=15.0 2023-09-29 06:55:06,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 06:55:06,305 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 06:55:06,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:08,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:55:08,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:08,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:10,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:55:10,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:55:11,519 INFO [train.py:1039] (1/4) Epoch 9, batch 150, loss[loss=0.2146, simple_loss=0.2895, pruned_loss=0.06985, over 24336.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2861, pruned_loss=0.07462, over 2523803.02 frames. ], batch size: 74, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:55:11,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:55:11,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:11,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:13,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:14,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:55:14,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:55:16,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:21,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:21,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:23,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:24,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-09-29 06:55:26,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:30,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:55:30,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:32,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=284373.3333333333, ans=0.5 2023-09-29 06:55:35,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 06:55:35,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 06:55:35,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 06:55:37,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:55:37,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:55:39,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:55:41,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:41,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:41,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,981 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 06:55:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:51,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:56,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:55:58,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 06:56:03,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:56:03,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:56:03,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:06,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:56:08,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:56:08,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:56:10,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:11,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 06:56:16,140 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 2.032e+02 2.478e+02 3.173e+02 5.553e+02, threshold=4.955e+02, percent-clipped=3.0 2023-09-29 06:56:16,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:16,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:17,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:56:17,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:56:20,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:23,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:56:26,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:56:26,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:56:26,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=284573.3333333333, ans=0.125 2023-09-29 06:56:27,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:29,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:56:29,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 06:56:29,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=284573.3333333333, ans=0.2 2023-09-29 06:56:30,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:30,764 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 06:56:34,635 INFO [train.py:1039] (1/4) Epoch 9, batch 200, loss[loss=0.2092, simple_loss=0.2778, pruned_loss=0.07034, over 23373.00 frames. ], tot_loss[loss=0.2173, simple_loss=0.2857, pruned_loss=0.07447, over 3014678.76 frames. ], batch size: 105, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:56:36,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:39,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:56:39,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:56:42,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 06:56:44,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:44,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:45,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=284640.0, ans=0.0 2023-09-29 06:56:47,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 06:56:49,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:56:50,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:52,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:56,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:56:57,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:57,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:59,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=284706.6666666667, ans=0.125 2023-09-29 06:57:11,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=284773.3333333333, ans=0.0 2023-09-29 06:57:14,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:57:15,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:57:16,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:57:16,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=284773.3333333333, ans=0.125 2023-09-29 06:57:17,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:57:19,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:57:19,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:57:19,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=284773.3333333333, ans=0.125 2023-09-29 06:57:20,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:22,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:57:23,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:23,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:25,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 06:57:25,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:57:25,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:28,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:57:36,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:44,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:44,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:57:52,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:55,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 06:57:55,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:56,447 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=24.07 vs. limit=22.5 2023-09-29 06:57:56,914 INFO [train.py:1039] (1/4) Epoch 9, batch 250, loss[loss=0.1857, simple_loss=0.2617, pruned_loss=0.05486, over 24377.00 frames. ], tot_loss[loss=0.2171, simple_loss=0.2849, pruned_loss=0.07464, over 3381395.89 frames. ], batch size: 61, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:57:56,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:57:56,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:58,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:57:58,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 06:58:00,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:00,252 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 06:58:01,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:05,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:58:06,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:06,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:58:08,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:58:09,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:11,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:58:14,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:58:16,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=285040.0, ans=0.125 2023-09-29 06:58:24,643 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.62 vs. limit=15.0 2023-09-29 06:58:25,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:28,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:58:28,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:58:34,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:58:36,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:58:36,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:58:36,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-09-29 06:58:37,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:39,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:58:39,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:58:39,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:40,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=285106.6666666667, ans=0.07 2023-09-29 06:58:41,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:58:42,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 06:58:43,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:45,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:58:46,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:58:46,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:58:46,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:58:48,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:58:48,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:58:50,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:53,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:58:53,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:58:58,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:59:01,251 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.006e+02 2.267e+02 2.549e+02 3.617e+02, threshold=4.534e+02, percent-clipped=0.0 2023-09-29 06:59:01,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=285240.0, ans=0.2 2023-09-29 06:59:03,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:04,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:06,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:07,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:59:12,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:12,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:59:16,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 06:59:18,720 INFO [train.py:1039] (1/4) Epoch 9, batch 300, loss[loss=0.1933, simple_loss=0.2465, pruned_loss=0.07006, over 23373.00 frames. ], tot_loss[loss=0.2145, simple_loss=0.2818, pruned_loss=0.07359, over 3674753.33 frames. ], batch size: 285, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:59:18,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:59:18,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:59:20,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 06:59:22,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:59:22,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:59:22,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 06:59:22,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=285306.6666666667, ans=0.1 2023-09-29 06:59:25,096 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.38 vs. limit=22.5 2023-09-29 06:59:27,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:29,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:59:34,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:59:34,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 06:59:36,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:37,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:59:37,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 06:59:37,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:41,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:59:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:59:46,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 06:59:50,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 06:59:52,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:53,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:55,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:55,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 06:59:55,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:59:57,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:00:00,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:00:02,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:05,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=285440.0, ans=0.125 2023-09-29 07:00:07,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:00:07,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 07:00:07,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:00:08,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=285506.6666666667, ans=0.125 2023-09-29 07:00:10,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:12,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 07:00:14,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:16,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.72 vs. limit=15.0 2023-09-29 07:00:19,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:00:19,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=285506.6666666667, ans=0.125 2023-09-29 07:00:22,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:00:22,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 07:00:25,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:25,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:00:27,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:27,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.14 vs. limit=15.0 2023-09-29 07:00:28,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:00:29,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=285573.3333333333, ans=0.2 2023-09-29 07:00:30,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 07:00:31,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:00:32,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:32,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 07:00:32,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=285573.3333333333, ans=0.025 2023-09-29 07:00:35,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:35,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:35,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:37,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:37,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:37,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=285573.3333333333, ans=0.125 2023-09-29 07:00:41,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-29 07:00:42,580 INFO [train.py:1039] (1/4) Epoch 9, batch 350, loss[loss=0.1931, simple_loss=0.2693, pruned_loss=0.05849, over 24609.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2801, pruned_loss=0.07291, over 3900101.04 frames. ], batch size: 60, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:00:44,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:00:44,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:00:47,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:53,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:55,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:56,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:58,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 07:01:00,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:00,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 07:01:02,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:03,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 07:01:05,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:08,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 07:01:09,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:01:12,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:13,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=285773.3333333333, ans=0.0 2023-09-29 07:01:14,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:01:15,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:15,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:16,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:16,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:16,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:01:19,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:01:19,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:24,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=285773.3333333333, ans=0.125 2023-09-29 07:01:27,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:01:27,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:01:27,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:01:29,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:35,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 07:01:35,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:35,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=285840.0, ans=0.0 2023-09-29 07:01:40,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:40,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:01:40,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 07:01:45,357 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.981e+02 2.324e+02 2.780e+02 5.402e+02, threshold=4.648e+02, percent-clipped=1.0 2023-09-29 07:01:45,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:47,012 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 07:01:47,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 07:01:47,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:51,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:51,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 07:01:54,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:58,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:01:59,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:01,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:01,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:03,923 INFO [train.py:1039] (1/4) Epoch 9, batch 400, loss[loss=0.2001, simple_loss=0.2838, pruned_loss=0.05822, over 24645.00 frames. ], tot_loss[loss=0.2126, simple_loss=0.2801, pruned_loss=0.07256, over 4089251.00 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:02:04,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:07,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:02:08,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:02:09,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.56 vs. limit=15.0 2023-09-29 07:02:10,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=285973.3333333333, ans=0.0 2023-09-29 07:02:11,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 07:02:11,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:11,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:13,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:02:13,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:16,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:18,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:20,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 07:02:21,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 07:02:21,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:23,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 07:02:25,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:02:28,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:28,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 07:02:28,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:02:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:28,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=286040.0, ans=10.0 2023-09-29 07:02:29,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:31,561 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 07:02:31,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 07:02:36,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:38,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:38,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 07:02:39,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 07:02:44,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:02:46,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:02:52,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 07:02:58,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:02:59,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 07:03:01,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:03:02,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:03:02,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 07:03:06,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:03:09,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:03:11,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:03:11,968 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.39 vs. limit=15.0 2023-09-29 07:03:14,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:16,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 07:03:19,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:03:19,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 07:03:20,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:03:20,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:03:22,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 07:03:25,686 INFO [train.py:1039] (1/4) Epoch 9, batch 450, loss[loss=0.1846, simple_loss=0.2589, pruned_loss=0.05515, over 20950.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2806, pruned_loss=0.07341, over 4205227.65 frames. ], batch size: 46, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:03:25,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:03:25,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:03:26,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:03:29,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 07:03:29,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:03:31,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:03:32,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:03:32,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 07:03:32,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:03:34,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:03:36,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:03:44,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:45,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.68 vs. limit=15.0 2023-09-29 07:03:46,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:03:47,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 07:03:49,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 07:03:51,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:03:54,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:56,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:03:59,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:03:59,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:04:02,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 07:04:02,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 07:04:05,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 07:04:05,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:07,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:07,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:04:10,824 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 07:04:10,837 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 07:04:12,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:04:13,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:04:15,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:04:19,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.36 vs. limit=10.0 2023-09-29 07:04:20,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:04:20,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:04:21,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:04:23,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 07:04:26,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:29,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:04:29,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:04:31,300 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.979e+02 2.168e+02 2.458e+02 3.361e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 07:04:31,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 07:04:36,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:04:36,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 07:04:38,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 07:04:39,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:43,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:04:46,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:04:48,027 INFO [train.py:1039] (1/4) Epoch 9, batch 500, loss[loss=0.2285, simple_loss=0.2913, pruned_loss=0.08289, over 23215.00 frames. ], tot_loss[loss=0.2156, simple_loss=0.2821, pruned_loss=0.07456, over 4302029.05 frames. ], batch size: 105, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:04:48,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:04:48,230 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 07:04:51,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:52,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:04:52,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:52,987 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 07:04:55,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 07:04:55,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:59,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:05:03,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:05:04,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:05:07,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:05:08,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:05:08,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:17,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:19,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:05:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:05:20,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:20,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 07:05:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:05:24,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:05:26,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:05:26,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:05:26,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:28,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 07:05:31,376 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 07:05:31,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=286773.3333333333, ans=0.0 2023-09-29 07:05:34,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:34,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:36,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:37,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:37,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:05:40,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 07:05:43,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:05:44,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:46,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:05:48,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=15.0 2023-09-29 07:05:49,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:54,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:54,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=286906.6666666667, ans=0.125 2023-09-29 07:05:58,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 07:05:58,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:58,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:06:01,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 07:06:01,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:06:04,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:09,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 07:06:10,900 INFO [train.py:1039] (1/4) Epoch 9, batch 550, loss[loss=0.2022, simple_loss=0.2674, pruned_loss=0.06854, over 20156.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2833, pruned_loss=0.07456, over 4390059.96 frames. ], batch size: 44, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:06:11,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=286973.3333333333, ans=0.04949747468305833 2023-09-29 07:06:12,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 07:06:12,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:13,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 07:06:14,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=286973.3333333333, ans=0.07 2023-09-29 07:06:15,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:06:15,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:17,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:06:18,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:06:20,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:21,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 07:06:21,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:06:27,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:27,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:31,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:06:31,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:31,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=287040.0, ans=0.1 2023-09-29 07:06:36,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 07:06:38,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 07:06:39,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:06:43,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:06:44,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:46,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:06:48,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:48,652 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 07:06:50,329 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:06:51,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:53,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:06:57,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:57,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:06:57,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:06:59,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:01,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 07:07:02,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 07:07:03,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:03,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:07:04,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:07:04,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:07:06,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:07:10,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:07:11,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:07:11,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:13,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 07:07:14,838 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.039e+02 2.212e+02 2.496e+02 3.392e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 07:07:14,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:07:17,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:17,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:07:18,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:20,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:07:20,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:07:20,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=287240.0, ans=0.125 2023-09-29 07:07:28,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 07:07:31,087 INFO [train.py:1039] (1/4) Epoch 9, batch 600, loss[loss=0.2079, simple_loss=0.289, pruned_loss=0.06341, over 24444.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.2834, pruned_loss=0.07462, over 4454005.25 frames. ], batch size: 77, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:07:31,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 07:07:34,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:07:34,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:07:34,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:36,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=287306.6666666667, ans=10.0 2023-09-29 07:07:41,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:07:42,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:07:45,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 07:07:45,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=287306.6666666667, ans=0.035 2023-09-29 07:07:47,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:07:50,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:07:52,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:54,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 07:07:54,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:08:01,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 07:08:05,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:08:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:07,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:08:11,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:08:11,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:08:13,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:15,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.13 vs. limit=15.0 2023-09-29 07:08:22,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:08:25,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:25,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:08:25,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:33,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 07:08:38,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:08:38,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:08:42,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 07:08:44,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:08:45,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=287573.3333333333, ans=0.1 2023-09-29 07:08:46,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 07:08:46,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:08:46,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:08:46,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=287573.3333333333, ans=0.0 2023-09-29 07:08:48,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=287573.3333333333, ans=0.0 2023-09-29 07:08:50,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=287573.3333333333, ans=0.125 2023-09-29 07:08:52,980 INFO [train.py:1039] (1/4) Epoch 9, batch 650, loss[loss=0.1881, simple_loss=0.2667, pruned_loss=0.05476, over 24459.00 frames. ], tot_loss[loss=0.2156, simple_loss=0.2824, pruned_loss=0.07443, over 4504459.08 frames. ], batch size: 66, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:08:53,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:08:53,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=287640.0, ans=0.0 2023-09-29 07:08:55,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:08:58,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:08:58,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:09:00,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:04,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 07:09:04,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=287640.0, ans=0.125 2023-09-29 07:09:05,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:09:11,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:09:11,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:14,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:19,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 07:09:20,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:21,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:24,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:09:26,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:09:29,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:29,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:31,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:09:32,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:34,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:09:36,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:09:36,733 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 07:09:36,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:36,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:39,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:42,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:42,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:09:42,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:09:44,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 07:09:45,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:09:45,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:09:47,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:09:47,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:47,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:09:49,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 07:09:50,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 07:09:50,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:50,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:50,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:09:50,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:53,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:58,843 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.023e+02 2.249e+02 2.557e+02 3.525e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 07:10:01,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:01,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:03,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:10:06,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:07,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:10:07,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:15,899 INFO [train.py:1039] (1/4) Epoch 9, batch 700, loss[loss=0.2051, simple_loss=0.2672, pruned_loss=0.07153, over 23626.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2804, pruned_loss=0.07314, over 4556242.48 frames. ], batch size: 256, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:10:15,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:10:15,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:16,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:16,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:20,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 07:10:22,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 07:10:22,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=287973.3333333333, ans=0.0 2023-09-29 07:10:25,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 07:10:25,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:28,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:10:30,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 07:10:33,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:37,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:10:39,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:39,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:10:41,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:44,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:47,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:10:47,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:10:49,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 07:10:52,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 07:10:55,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:10:57,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:10:58,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:11:03,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:11:05,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 07:11:10,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:10,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:11:10,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 07:11:10,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=288173.3333333333, ans=0.125 2023-09-29 07:11:10,868 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:11:16,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:11:16,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:19,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:11:24,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=288240.0, ans=0.2 2023-09-29 07:11:25,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:11:25,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 07:11:29,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 07:11:29,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 07:11:30,597 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.17 vs. limit=15.0 2023-09-29 07:11:31,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:31,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=288240.0, ans=0.0 2023-09-29 07:11:33,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:34,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:11:36,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:36,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 07:11:38,255 INFO [train.py:1039] (1/4) Epoch 9, batch 750, loss[loss=0.2159, simple_loss=0.2733, pruned_loss=0.0793, over 23814.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.2801, pruned_loss=0.07308, over 4589744.19 frames. ], batch size: 179, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:11:41,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 07:11:41,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 07:11:41,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 07:11:43,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 07:11:43,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 07:11:45,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:11:46,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 07:11:48,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:48,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:11:51,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:11:52,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:52,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:11:52,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:52,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=288306.6666666667, ans=0.125 2023-09-29 07:11:55,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:11:57,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:11:58,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:12:00,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:02,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:02,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 07:12:02,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=288373.3333333333, ans=0.125 2023-09-29 07:12:03,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:12:03,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:05,026 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.86 vs. limit=10.0 2023-09-29 07:12:05,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:09,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:12:09,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 07:12:09,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:12:12,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 07:12:12,512 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 07:12:14,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 07:12:14,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:12:14,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:12:16,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:12:24,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:12:24,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:24,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:12:27,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:29,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:12:29,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 07:12:29,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:12:30,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 07:12:32,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:12:35,498 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:12:36,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:12:36,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 07:12:38,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:43,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=288573.3333333333, ans=0.0 2023-09-29 07:12:43,758 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.67 vs. limit=15.0 2023-09-29 07:12:44,785 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.964e+02 2.224e+02 2.470e+02 4.454e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 07:12:44,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:12:45,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:12:46,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:48,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:12:52,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 07:12:53,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:12:53,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:01,776 INFO [train.py:1039] (1/4) Epoch 9, batch 800, loss[loss=0.2236, simple_loss=0.3058, pruned_loss=0.07068, over 23998.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2807, pruned_loss=0.07288, over 4620142.88 frames. ], batch size: 80, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:13:01,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:01,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:13:04,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.78 vs. limit=12.0 2023-09-29 07:13:11,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:11,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:12,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:13:12,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:14,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:14,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:15,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:19,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:21,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:13:24,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 07:13:24,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:25,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:25,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:13:25,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:25,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 07:13:27,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:28,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 07:13:31,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:35,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:38,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:13:38,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:39,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:39,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:44,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:13:44,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:13:45,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 07:13:46,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.16 vs. limit=12.0 2023-09-29 07:13:47,603 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 07:13:47,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 07:13:47,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:13:47,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:50,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:50,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:13:56,036 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 07:13:56,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 07:13:57,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:13:59,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:14:01,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:14:07,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:07,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 07:14:08,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:14:13,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 07:14:21,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:22,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:14:24,345 INFO [train.py:1039] (1/4) Epoch 9, batch 850, loss[loss=0.2147, simple_loss=0.2851, pruned_loss=0.07216, over 24676.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2823, pruned_loss=0.07398, over 4633935.30 frames. ], batch size: 65, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:14:24,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 07:14:24,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:14:24,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:26,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 07:14:27,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:29,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:14:32,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:34,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:14:35,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:14:37,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 07:14:37,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 07:14:37,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 07:14:39,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:39,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:14:41,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:43,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:43,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:14:48,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:48,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:48,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 07:14:51,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 07:14:53,805 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.74 vs. limit=15.0 2023-09-29 07:14:55,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:56,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 07:14:59,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 07:15:00,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 07:15:02,525 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 07:15:04,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:04,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:15:04,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:15:06,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:07,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:09,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 07:15:11,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:12,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:12,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:15:12,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:15:14,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:15:16,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:15:16,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=289173.3333333333, ans=0.1 2023-09-29 07:15:18,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 07:15:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:15:22,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:24,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:15:24,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:24,795 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.28 vs. limit=6.0 2023-09-29 07:15:25,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:28,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:30,144 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.042e+02 2.331e+02 2.770e+02 4.715e+02, threshold=4.662e+02, percent-clipped=1.0 2023-09-29 07:15:30,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:15:31,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:15:31,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:32,183 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:15:33,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:15:40,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:15:41,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:42,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 07:15:42,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:15:42,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:46,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 07:15:47,736 INFO [train.py:1039] (1/4) Epoch 9, batch 900, loss[loss=0.1865, simple_loss=0.2618, pruned_loss=0.05566, over 24535.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2821, pruned_loss=0.07365, over 4661147.61 frames. ], batch size: 60, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:15:53,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:15:57,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 07:16:00,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:16:01,107 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:16:02,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 07:16:03,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:16:04,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:16:04,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:04,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:16:05,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:16:16,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:17,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:16:17,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:16:20,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:25,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 07:16:27,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:16:32,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:16:32,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:16:32,444 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 07:16:33,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 07:16:41,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:16:41,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:16:41,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:16:49,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:49,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:16:51,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 07:16:51,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:55,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 07:16:55,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=289573.3333333333, ans=0.125 2023-09-29 07:16:55,823 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.81 vs. limit=15.0 2023-09-29 07:16:57,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:16:57,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:59,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:00,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:02,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 07:17:02,794 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 07:17:05,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:17:05,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 07:17:07,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:10,225 INFO [train.py:1039] (1/4) Epoch 9, batch 950, loss[loss=0.2116, simple_loss=0.2736, pruned_loss=0.07483, over 23400.00 frames. ], tot_loss[loss=0.2143, simple_loss=0.2819, pruned_loss=0.07334, over 4689416.64 frames. ], batch size: 119, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:17:11,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 07:17:16,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:19,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.81 vs. limit=5.0 2023-09-29 07:17:21,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:17:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 07:17:28,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:29,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:31,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:17:32,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 07:17:33,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:17:35,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:37,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 07:17:37,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:41,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:42,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:43,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 07:17:43,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 07:17:45,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:46,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:17:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:52,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:56,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 07:17:58,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:17:58,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:18:00,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:00,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:00,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:18:01,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=289840.0, ans=22.5 2023-09-29 07:18:03,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=289840.0, ans=0.0 2023-09-29 07:18:07,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 07:18:07,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:18:10,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:12,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:12,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 07:18:12,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:12,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:18:13,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 07:18:16,759 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 1.953e+02 2.303e+02 2.846e+02 4.844e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 07:18:16,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:18:18,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=289906.6666666667, ans=0.125 2023-09-29 07:18:20,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:24,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:26,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 07:18:26,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 07:18:31,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:32,661 INFO [train.py:1039] (1/4) Epoch 9, batch 1000, loss[loss=0.2047, simple_loss=0.2513, pruned_loss=0.07902, over 22708.00 frames. ], tot_loss[loss=0.2136, simple_loss=0.2806, pruned_loss=0.0733, over 4694037.72 frames. ], batch size: 322, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:18:36,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 07:18:37,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:18:43,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:18:45,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 07:18:45,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 07:18:49,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:18:49,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:50,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=290040.0, ans=0.0 2023-09-29 07:18:51,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:54,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 07:18:57,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 07:18:57,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 07:18:59,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:00,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 07:19:02,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 07:19:02,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 07:19:02,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:04,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:15,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:16,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.76 vs. limit=15.0 2023-09-29 07:19:16,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:19:17,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:18,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:18,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 07:19:18,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:20,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:19:20,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:21,849 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 07:19:24,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 07:19:25,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 07:19:26,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=290173.3333333333, ans=0.125 2023-09-29 07:19:28,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 07:19:29,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:19:31,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=290173.3333333333, ans=0.125 2023-09-29 07:19:34,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:34,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:19:34,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:38,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:19:38,491 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:19:39,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 07:19:43,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:19:43,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 07:19:43,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 07:19:46,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:19:46,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:48,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:19:51,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:19:53,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:56,394 INFO [train.py:1039] (1/4) Epoch 9, batch 1050, loss[loss=0.1817, simple_loss=0.2218, pruned_loss=0.0708, over 19308.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.2787, pruned_loss=0.07236, over 4700309.67 frames. ], batch size: 388, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:19:56,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:19:58,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:20:01,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:20:01,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:02,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:05,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:20:05,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:20:09,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:20:10,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:20:10,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:20:12,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:20:13,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 07:20:14,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:14,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 07:20:17,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:20:17,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 07:20:17,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:20:24,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:24,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:20:26,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:27,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 07:20:27,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 07:20:29,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:32,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 07:20:36,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 07:20:36,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:20:41,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:20:43,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:20:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:20:43,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:20:48,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:20:53,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 07:20:54,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 07:20:56,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 07:20:56,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:20:56,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:20:57,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 07:20:57,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=290506.6666666667, ans=0.1 2023-09-29 07:21:02,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:21:04,054 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.054e+02 2.289e+02 2.734e+02 4.286e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 07:21:04,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:21:04,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:05,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:05,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 07:21:10,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=290573.3333333333, ans=0.2 2023-09-29 07:21:12,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:12,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 07:21:12,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 07:21:13,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:21:16,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:21:17,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=290640.0, ans=0.1 2023-09-29 07:21:18,248 INFO [train.py:1039] (1/4) Epoch 9, batch 1100, loss[loss=0.2155, simple_loss=0.2967, pruned_loss=0.06719, over 24583.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2779, pruned_loss=0.07193, over 4681738.25 frames. ], batch size: 71, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:21:23,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:21:23,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=290640.0, ans=0.125 2023-09-29 07:21:29,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:21:32,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:21:32,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:33,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 07:21:33,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:21:36,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:21:40,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:21:43,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:21:43,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 07:21:45,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:21:45,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:45,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:48,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:21:50,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:21:54,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:21:58,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 07:22:00,662 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 07:22:00,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:03,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:05,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:22:06,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:22:07,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 07:22:08,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:22:08,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:22:08,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:22:10,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:10,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 07:22:12,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=290840.0, ans=0.125 2023-09-29 07:22:12,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=290840.0, ans=0.125 2023-09-29 07:22:17,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:22:17,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 07:22:19,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:22:24,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:22:27,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 07:22:27,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:22:29,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:32,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:32,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:34,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 07:22:35,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:22:37,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:37,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 07:22:39,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:22:39,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 07:22:41,378 INFO [train.py:1039] (1/4) Epoch 9, batch 1150, loss[loss=0.2051, simple_loss=0.2714, pruned_loss=0.06941, over 23491.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2778, pruned_loss=0.07196, over 4681045.47 frames. ], batch size: 134, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:22:41,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:22:41,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:22:41,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:22:47,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=290973.3333333333, ans=0.125 2023-09-29 07:22:48,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:49,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:22:52,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:52,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:22:52,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 07:22:52,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:22:56,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 07:22:56,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:57,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:23:03,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 07:23:05,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:10,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:23:10,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:10,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 07:23:11,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:23:11,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:23:17,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 07:23:18,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:20,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:23:30,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:34,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=291173.3333333333, ans=0.125 2023-09-29 07:23:36,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:38,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 07:23:38,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:38,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:45,426 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 07:23:47,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:48,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.090e+02 2.381e+02 2.869e+02 4.983e+02, threshold=4.763e+02, percent-clipped=2.0 2023-09-29 07:23:49,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=291240.0, ans=0.125 2023-09-29 07:23:53,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=291240.0, ans=0.125 2023-09-29 07:23:54,738 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 07:23:57,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:23:59,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:23:59,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:24:00,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:24:02,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:03,991 INFO [train.py:1039] (1/4) Epoch 9, batch 1200, loss[loss=0.2121, simple_loss=0.2915, pruned_loss=0.06636, over 24259.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2793, pruned_loss=0.07236, over 4689082.89 frames. ], batch size: 74, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:24:05,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=291306.6666666667, ans=0.1 2023-09-29 07:24:07,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:24:07,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:24:10,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:10,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:10,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:24:11,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:24:15,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:24:15,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:16,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:20,118 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 07:24:24,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 07:24:26,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:24:29,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:24:31,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:32,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:24:32,815 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 07:24:34,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:35,363 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.37 vs. limit=15.0 2023-09-29 07:24:43,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:24:43,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:24:43,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 07:24:44,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:24:50,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 07:24:53,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 07:24:54,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:54,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:57,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=291506.6666666667, ans=0.04949747468305833 2023-09-29 07:24:58,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:24:58,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:25:00,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:25:00,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:25:02,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:25:02,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 07:25:02,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:25:02,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:02,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:25:05,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:05,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:25:09,490 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.40 vs. limit=15.0 2023-09-29 07:25:10,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:25:13,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:25:16,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=291573.3333333333, ans=0.2 2023-09-29 07:25:17,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 07:25:19,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=291573.3333333333, ans=0.125 2023-09-29 07:25:20,928 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 07:25:22,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:25,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=291640.0, ans=0.1 2023-09-29 07:25:26,013 INFO [train.py:1039] (1/4) Epoch 9, batch 1250, loss[loss=0.2287, simple_loss=0.2855, pruned_loss=0.08601, over 23617.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2803, pruned_loss=0.07227, over 4710671.47 frames. ], batch size: 256, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:25:26,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:27,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:25:27,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=291640.0, ans=0.1 2023-09-29 07:25:29,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:32,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 07:25:33,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.95 vs. limit=22.5 2023-09-29 07:25:38,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:25:38,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:39,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 07:25:41,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:25:42,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:25:47,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:25:47,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:49,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:25:49,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:51,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:25:52,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=291706.6666666667, ans=6.0 2023-09-29 07:25:56,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:25:56,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:25:56,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:57,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:57,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:03,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:03,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:26:10,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 07:26:10,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:26:11,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=291773.3333333333, ans=0.0 2023-09-29 07:26:13,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:14,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 07:26:15,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:26:15,544 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 07:26:15,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:15,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:15,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=291840.0, ans=0.125 2023-09-29 07:26:20,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:20,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:21,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:26:22,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=291840.0, ans=0.1 2023-09-29 07:26:24,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 07:26:24,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 07:26:24,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 07:26:27,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:29,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 07:26:29,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:31,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:26:31,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:26:33,088 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.977e+02 2.179e+02 2.388e+02 3.416e+02, threshold=4.359e+02, percent-clipped=0.0 2023-09-29 07:26:33,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 07:26:33,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:26:34,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:26:34,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:26:34,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:35,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=291906.6666666667, ans=0.1 2023-09-29 07:26:36,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 07:26:40,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:42,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:26:44,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:26:45,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:26:48,705 INFO [train.py:1039] (1/4) Epoch 9, batch 1300, loss[loss=0.1771, simple_loss=0.2512, pruned_loss=0.05148, over 24293.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2807, pruned_loss=0.07296, over 4697094.59 frames. ], batch size: 61, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:26:48,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:48,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 07:26:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:55,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:26:56,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:26:58,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:59,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:26:59,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 07:27:05,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:27:06,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:27:08,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 07:27:13,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:27:16,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:16,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:19,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:27:20,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=292106.6666666667, ans=0.2 2023-09-29 07:27:22,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:23,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:27:23,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:27:23,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 07:27:31,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:27:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:27:32,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 07:27:34,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:27:35,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:27:37,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:27:38,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 07:27:41,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:41,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 07:27:41,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:44,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:44,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:27:48,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 07:27:49,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 07:27:51,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 07:27:56,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:27:59,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 07:28:00,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.00 vs. limit=15.0 2023-09-29 07:28:01,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:10,008 INFO [train.py:1039] (1/4) Epoch 9, batch 1350, loss[loss=0.1921, simple_loss=0.2699, pruned_loss=0.05716, over 24509.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2793, pruned_loss=0.0728, over 4692462.30 frames. ], batch size: 63, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:28:10,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 07:28:13,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:14,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=292306.6666666667, ans=0.125 2023-09-29 07:28:16,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:21,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:22,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:25,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:28:25,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:29,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:31,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 07:28:33,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:28:33,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:28:35,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=292373.3333333333, ans=0.125 2023-09-29 07:28:36,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 07:28:37,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:28:39,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:28:39,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 07:28:41,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 07:28:42,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 07:28:44,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:44,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 07:28:48,952 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.49 vs. limit=15.0 2023-09-29 07:28:56,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:04,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:04,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:06,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 07:29:10,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:10,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 07:29:11,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:29:13,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:29:14,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:29:18,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 07:29:20,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:29:21,319 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.985e+02 2.227e+02 2.562e+02 4.004e+02, threshold=4.454e+02, percent-clipped=0.0 2023-09-29 07:29:25,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 07:29:27,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 07:29:32,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=292640.0, ans=0.04949747468305833 2023-09-29 07:29:33,067 INFO [train.py:1039] (1/4) Epoch 9, batch 1400, loss[loss=0.2258, simple_loss=0.2919, pruned_loss=0.07987, over 23207.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.2784, pruned_loss=0.07248, over 4687330.57 frames. ], batch size: 105, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:29:33,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=292640.0, ans=0.125 2023-09-29 07:29:34,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 07:29:36,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:39,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:29:39,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:29:40,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=292640.0, ans=0.125 2023-09-29 07:29:46,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 07:29:46,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 07:29:46,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292640.0, ans=0.1 2023-09-29 07:29:56,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:29:58,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:00,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:30:01,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:30:04,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:30:07,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:30:12,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=292773.3333333333, ans=0.125 2023-09-29 07:30:16,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:17,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=292773.3333333333, ans=0.0 2023-09-29 07:30:18,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:21,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 07:30:22,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:30:23,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:30:24,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:30:24,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:26,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:30:26,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:30:26,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:30:28,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 07:30:28,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:30:32,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:35,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:30:37,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=292840.0, ans=0.025 2023-09-29 07:30:42,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 07:30:43,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:30:43,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:30:46,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:30:48,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:30:50,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:30:52,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-09-29 07:30:53,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:30:56,704 INFO [train.py:1039] (1/4) Epoch 9, batch 1450, loss[loss=0.2082, simple_loss=0.2706, pruned_loss=0.07288, over 23570.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.278, pruned_loss=0.07197, over 4687633.67 frames. ], batch size: 134, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:30:58,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:30:58,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:58,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:31:03,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:03,423 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:31:03,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=292973.3333333333, ans=0.1 2023-09-29 07:31:05,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:31:08,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:31:08,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 07:31:09,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:31:09,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 07:31:11,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:11,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:11,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 07:31:13,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:14,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:31:16,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 07:31:16,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:16,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:31:19,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:22,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:25,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:31:25,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:31:29,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:29,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:32,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:31:32,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:34,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=293106.6666666667, ans=0.125 2023-09-29 07:31:36,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 07:31:39,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:44,377 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 07:31:45,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:31:48,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:31:49,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:31:49,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 07:31:54,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:55,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 07:31:57,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 07:31:57,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:00,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:32:02,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 07:32:03,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=293240.0, ans=0.1 2023-09-29 07:32:05,041 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.25 vs. limit=10.0 2023-09-29 07:32:05,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 07:32:05,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 07:32:07,321 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.945e+02 2.236e+02 2.452e+02 4.458e+02, threshold=4.473e+02, percent-clipped=1.0 2023-09-29 07:32:07,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:09,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:32:19,971 INFO [train.py:1039] (1/4) Epoch 9, batch 1500, loss[loss=0.2121, simple_loss=0.2868, pruned_loss=0.06866, over 23693.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2797, pruned_loss=0.07258, over 4694766.34 frames. ], batch size: 85, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:32:23,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 07:32:23,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:32:23,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:32:24,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:24,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:24,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=293306.6666666667, ans=0.125 2023-09-29 07:32:30,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:32:30,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 07:32:32,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:32:33,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:32:33,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:33,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:36,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:32:38,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:43,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:44,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 07:32:44,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:32:46,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:32:46,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:49,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 07:32:54,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 07:32:57,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:59,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 07:33:03,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:33:06,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:06,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:33:07,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:07,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 07:33:07,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:33:09,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:09,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 07:33:11,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:11,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=293506.6666666667, ans=0.0 2023-09-29 07:33:14,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:33:14,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 07:33:19,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:33:23,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:33:25,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=293506.6666666667, ans=0.0 2023-09-29 07:33:27,681 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 07:33:27,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:27,799 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 07:33:29,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:31,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:33:31,425 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 07:33:31,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:33:35,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 07:33:36,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=293573.3333333333, ans=0.0 2023-09-29 07:33:37,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:41,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=293573.3333333333, ans=0.0 2023-09-29 07:33:42,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:42,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:44,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:45,493 INFO [train.py:1039] (1/4) Epoch 9, batch 1550, loss[loss=0.2339, simple_loss=0.2913, pruned_loss=0.08828, over 23328.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2807, pruned_loss=0.07261, over 4709182.38 frames. ], batch size: 119, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:33:47,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 07:33:47,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 07:33:47,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=293640.0, ans=0.1 2023-09-29 07:33:48,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:33:48,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 07:33:48,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 07:33:50,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:52,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:52,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:33:52,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:33:54,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:54,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:57,950 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 07:33:59,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:59,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:34:00,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:34:02,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:34:02,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 07:34:04,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:34:04,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 07:34:06,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 07:34:06,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 07:34:07,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:09,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:12,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:34:15,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 07:34:15,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 07:34:17,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=293773.3333333333, ans=0.125 2023-09-29 07:34:21,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:26,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:34:28,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:34:28,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:34:28,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 07:34:33,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:34:35,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:38,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:34:40,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:34:41,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:41,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 07:34:41,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:34:42,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=293840.0, ans=0.125 2023-09-29 07:34:43,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:34:43,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:45,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:34:45,394 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 07:34:48,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:34:50,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=293906.6666666667, ans=0.0 2023-09-29 07:34:55,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 07:34:56,629 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.971e+02 2.191e+02 2.523e+02 4.378e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 07:34:58,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:00,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:01,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 07:35:03,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:35:04,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:04,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:35:04,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:35:04,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=293906.6666666667, ans=0.125 2023-09-29 07:35:06,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:35:08,742 INFO [train.py:1039] (1/4) Epoch 9, batch 1600, loss[loss=0.1847, simple_loss=0.2485, pruned_loss=0.06045, over 24337.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.2811, pruned_loss=0.07258, over 4712836.31 frames. ], batch size: 56, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:35:10,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:12,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 07:35:13,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 07:35:15,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 07:35:16,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:19,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 07:35:21,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:35:23,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:35:28,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:35:29,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=294040.0, ans=0.0 2023-09-29 07:35:32,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 07:35:33,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=294040.0, ans=0.125 2023-09-29 07:35:33,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-09-29 07:35:34,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=294040.0, ans=0.125 2023-09-29 07:35:35,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:35:36,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 07:35:37,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:37,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 07:35:44,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 07:35:44,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=294106.6666666667, ans=0.125 2023-09-29 07:35:45,027 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.11 vs. limit=22.5 2023-09-29 07:35:52,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:52,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 07:35:52,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:53,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:53,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:35:57,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 07:36:01,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 07:36:04,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:36:04,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:04,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:06,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:36:07,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=22.5 2023-09-29 07:36:08,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:36:09,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=294173.3333333333, ans=0.0 2023-09-29 07:36:10,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:36:11,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:36:16,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.45 vs. limit=22.5 2023-09-29 07:36:16,531 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.64 vs. limit=12.0 2023-09-29 07:36:17,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:18,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:36:20,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 07:36:20,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:36:22,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 07:36:26,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.90 vs. limit=15.0 2023-09-29 07:36:27,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=294240.0, ans=0.125 2023-09-29 07:36:28,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:30,236 INFO [train.py:1039] (1/4) Epoch 9, batch 1650, loss[loss=0.2068, simple_loss=0.2906, pruned_loss=0.06154, over 24323.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.282, pruned_loss=0.07289, over 4709964.24 frames. ], batch size: 74, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:36:31,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:36:31,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:36:31,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 07:36:31,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 07:36:31,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 07:36:33,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 07:36:35,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:36,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:38,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:36:38,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:36:39,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:41,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 07:36:44,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:36:44,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:44,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:36:44,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:36:44,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 07:36:46,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 07:36:52,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:36:55,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:37:06,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=294440.0, ans=0.0 2023-09-29 07:37:07,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 07:37:09,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 07:37:14,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:15,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:37:17,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:37:17,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:17,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=294506.6666666667, ans=0.125 2023-09-29 07:37:18,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.97 vs. limit=15.0 2023-09-29 07:37:18,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:37:18,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:21,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:23,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:24,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:24,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:26,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:26,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:37:31,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:32,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 07:37:34,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:34,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 07:37:35,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 07:37:35,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 07:37:35,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:37,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:37:37,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:38,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:38,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 07:37:42,541 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.311e+02 2.647e+02 4.475e+02, threshold=4.622e+02, percent-clipped=1.0 2023-09-29 07:37:42,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:44,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:37:44,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:47,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 07:37:51,802 INFO [train.py:1039] (1/4) Epoch 9, batch 1700, loss[loss=0.2153, simple_loss=0.2757, pruned_loss=0.07738, over 23610.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2806, pruned_loss=0.07269, over 4699316.36 frames. ], batch size: 149, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:37:51,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:51,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:37:52,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 07:37:53,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:37:53,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:37:53,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:55,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:37:55,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:37:55,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 07:37:56,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.64 vs. limit=10.0 2023-09-29 07:37:59,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:38:08,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=294706.6666666667, ans=0.125 2023-09-29 07:38:09,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:38:12,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:38:19,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:38:19,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:19,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:38:19,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:22,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 07:38:25,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:38:25,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:27,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:38:27,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=294773.3333333333, ans=0.2 2023-09-29 07:38:29,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:38:30,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 07:38:32,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 07:38:34,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:36,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 07:38:38,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:38:46,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=294840.0, ans=0.1 2023-09-29 07:38:47,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:38:47,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:38:49,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:52,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:38:52,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 07:38:52,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:53,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=294840.0, ans=0.0 2023-09-29 07:38:54,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:54,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 07:38:55,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:38:55,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:38:55,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:55,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:00,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:00,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:39:00,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=294906.6666666667, ans=0.125 2023-09-29 07:39:01,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:01,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:39:03,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:05,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:07,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 07:39:08,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:09,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=294906.6666666667, ans=0.0 2023-09-29 07:39:12,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:12,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 07:39:15,868 INFO [train.py:1039] (1/4) Epoch 9, batch 1750, loss[loss=0.1989, simple_loss=0.2792, pruned_loss=0.0593, over 24451.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2797, pruned_loss=0.07168, over 4715222.83 frames. ], batch size: 63, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:39:19,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:22,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:22,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:39:22,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 07:39:23,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:39:27,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:39:28,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:31,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 07:39:33,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:34,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:36,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 07:39:36,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:36,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:38,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:39:42,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:39:42,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 07:39:42,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:45,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:39:45,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 07:39:53,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:39:56,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:39:56,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:58,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=295106.6666666667, ans=0.0 2023-09-29 07:40:01,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:01,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:40:03,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:06,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:08,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:09,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:40:09,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 07:40:13,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:15,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 07:40:17,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:18,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:19,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=295173.3333333333, ans=0.2 2023-09-29 07:40:20,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:40:23,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:40:23,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:40:25,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:25,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=295240.0, ans=0.125 2023-09-29 07:40:26,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:27,933 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.058e+02 2.373e+02 2.670e+02 4.900e+02, threshold=4.746e+02, percent-clipped=2.0 2023-09-29 07:40:31,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:34,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:40:35,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:40:35,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 07:40:35,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:37,702 INFO [train.py:1039] (1/4) Epoch 9, batch 1800, loss[loss=0.219, simple_loss=0.2817, pruned_loss=0.07816, over 23710.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2799, pruned_loss=0.07116, over 4735276.13 frames. ], batch size: 164, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:40:37,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:40:37,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:37,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:40:37,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:40:39,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:40:41,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.34 vs. limit=15.0 2023-09-29 07:40:42,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:40:43,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:44,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=295306.6666666667, ans=0.125 2023-09-29 07:40:46,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:40:47,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:50,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:40:52,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:56,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:40:59,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:59,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:00,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:41:02,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:41:02,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 07:41:02,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:05,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:10,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 07:41:12,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 07:41:12,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 07:41:12,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:15,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:15,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:41:17,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:41:24,761 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 07:41:26,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:41:27,692 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.72 vs. limit=15.0 2023-09-29 07:41:28,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:28,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=295506.6666666667, ans=0.0 2023-09-29 07:41:30,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 07:41:30,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 07:41:30,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:41:31,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:41:31,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:41:37,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 07:41:44,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:41:44,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 07:41:46,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:41:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:47,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:41:47,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 07:41:49,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:41:49,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:41:49,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=295573.3333333333, ans=0.07 2023-09-29 07:41:53,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 07:41:53,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:56,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:41:56,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:41:57,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:58,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:42:00,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:42:02,007 INFO [train.py:1039] (1/4) Epoch 9, batch 1850, loss[loss=0.2134, simple_loss=0.2789, pruned_loss=0.07396, over 23215.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2798, pruned_loss=0.07124, over 4717304.99 frames. ], batch size: 119, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:42:02,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:42:03,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:06,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:42:08,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:14,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:42:16,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 07:42:20,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 07:42:22,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=295706.6666666667, ans=0.95 2023-09-29 07:42:24,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 07:42:28,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:42:28,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 07:42:28,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:42:39,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:42:41,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 07:42:44,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:42:44,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:42:46,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=295773.3333333333, ans=0.125 2023-09-29 07:42:49,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 07:42:49,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:49,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:42:50,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:42:52,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:55,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:58,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:42:58,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:59,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:42:59,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:01,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:02,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:43:06,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 07:43:07,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:07,998 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.37 vs. limit=15.0 2023-09-29 07:43:13,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:43:13,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:43:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 07:43:13,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 07:43:14,718 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.027e+02 2.265e+02 2.527e+02 4.357e+02, threshold=4.531e+02, percent-clipped=0.0 2023-09-29 07:43:15,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 07:43:16,571 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 07:43:18,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:43:18,833 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=15.0 2023-09-29 07:43:19,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:43:19,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:19,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:19,671 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 07:43:19,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:43:21,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:21,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:43:22,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:43:23,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=295973.3333333333, ans=0.0 2023-09-29 07:43:24,102 INFO [train.py:1039] (1/4) Epoch 9, batch 1900, loss[loss=0.224, simple_loss=0.2859, pruned_loss=0.08104, over 23890.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2803, pruned_loss=0.07141, over 4732226.54 frames. ], batch size: 195, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:43:24,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:43:24,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 07:43:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:25,882 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 07:43:25,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:43:27,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:32,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:35,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:43:37,518 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 07:43:37,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 07:43:39,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:40,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:43:40,770 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 07:43:40,816 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 07:43:45,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 07:43:46,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:43:50,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 07:43:53,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 07:44:04,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.52 vs. limit=15.0 2023-09-29 07:44:05,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 07:44:07,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.11 vs. limit=12.0 2023-09-29 07:44:08,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 07:44:08,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:09,708 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 07:44:09,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 07:44:09,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 07:44:11,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 07:44:11,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:44:15,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 07:44:20,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:44:21,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:21,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 07:44:23,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:44:25,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=296173.3333333333, ans=0.0 2023-09-29 07:44:26,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 07:44:28,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:34,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:44:34,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:44:34,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:44:35,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:44:37,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:44:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:44:41,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:44:42,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=296240.0, ans=0.1 2023-09-29 07:44:44,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:44,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:44:46,159 INFO [train.py:1039] (1/4) Epoch 9, batch 1950, loss[loss=0.2078, simple_loss=0.2846, pruned_loss=0.0655, over 23990.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2812, pruned_loss=0.07209, over 4721112.92 frames. ], batch size: 86, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:44:47,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:44:47,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:47,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:49,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:50,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:44:55,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:44:56,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:56,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:44:58,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 07:44:58,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:44:58,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:00,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:04,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:45:04,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:04,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:06,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:10,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:45:10,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:45:10,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:45:10,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:14,218 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:45:15,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:19,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:45:19,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:19,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:45:19,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 07:45:19,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:45:19,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:45:19,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=296440.0, ans=0.125 2023-09-29 07:45:21,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:24,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:25,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:45:30,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:45:32,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.18 vs. limit=15.0 2023-09-29 07:45:33,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:45:33,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:45:34,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 07:45:34,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:45:39,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:39,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:45:40,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:45:45,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=296506.6666666667, ans=0.125 2023-09-29 07:45:49,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:50,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:53,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:55,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:59,268 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.955e+02 2.207e+02 2.649e+02 3.533e+02, threshold=4.414e+02, percent-clipped=0.0 2023-09-29 07:45:59,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:45:59,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:46:00,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 07:46:00,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:46:03,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:46:03,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=296573.3333333333, ans=0.125 2023-09-29 07:46:04,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 07:46:06,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:06,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=296573.3333333333, ans=0.125 2023-09-29 07:46:09,284 INFO [train.py:1039] (1/4) Epoch 9, batch 2000, loss[loss=0.1894, simple_loss=0.2634, pruned_loss=0.05772, over 24623.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.2823, pruned_loss=0.07274, over 4719681.21 frames. ], batch size: 65, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:46:09,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:46:10,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:46:10,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:46:11,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:46:12,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:46:17,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 07:46:17,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:46:23,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:46:24,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=296706.6666666667, ans=0.1 2023-09-29 07:46:25,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 07:46:26,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:46:26,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:30,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:46:31,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 07:46:35,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:39,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 07:46:39,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:46:41,640 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.04 vs. limit=15.0 2023-09-29 07:46:42,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 07:46:42,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:45,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:46:46,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:46:46,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:48,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:46:49,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:46:49,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 07:46:53,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 07:46:53,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:53,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:01,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:02,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:47:02,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:02,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:47:04,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:04,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:06,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:06,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:07,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:09,759 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:47:10,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:47:11,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=296840.0, ans=0.5 2023-09-29 07:47:12,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 07:47:15,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:47:16,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:47:24,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:25,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:25,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:27,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:47:27,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:47:29,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:31,097 INFO [train.py:1039] (1/4) Epoch 9, batch 2050, loss[loss=0.1973, simple_loss=0.2759, pruned_loss=0.05936, over 24444.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.2815, pruned_loss=0.07239, over 4731126.92 frames. ], batch size: 63, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:47:31,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:32,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=296973.3333333333, ans=0.125 2023-09-29 07:47:34,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:34,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:38,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.89 vs. limit=15.0 2023-09-29 07:47:41,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:46,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:47:46,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:48,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:47:49,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 07:47:49,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:47:50,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:50,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=297040.0, ans=0.125 2023-09-29 07:47:51,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:48:00,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:00,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:03,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 07:48:06,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:07,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 07:48:07,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:09,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:11,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:13,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:48:13,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:14,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:48:14,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=297106.6666666667, ans=0.125 2023-09-29 07:48:16,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:48:16,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:48:21,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:24,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:48:24,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=297173.3333333333, ans=0.1 2023-09-29 07:48:25,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:48:26,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:48:32,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:48:36,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:48:36,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 07:48:42,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:44,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 2.106e+02 2.256e+02 2.757e+02 3.895e+02, threshold=4.512e+02, percent-clipped=0.0 2023-09-29 07:48:44,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:48:47,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:48:48,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 07:48:52,628 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 07:48:52,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:48:54,323 INFO [train.py:1039] (1/4) Epoch 9, batch 2100, loss[loss=0.2233, simple_loss=0.2702, pruned_loss=0.08823, over 22803.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.28, pruned_loss=0.07175, over 4725701.08 frames. ], batch size: 322, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:48:54,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:54,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:48:56,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:56,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 07:48:56,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=297306.6666666667, ans=0.125 2023-09-29 07:48:57,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 07:48:59,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:49:02,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:49:02,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:49:05,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:06,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:49:06,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 07:49:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:49:08,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 07:49:08,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 07:49:09,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:10,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:10,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 07:49:11,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 07:49:18,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 07:49:18,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:49:20,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=297373.3333333333, ans=0.125 2023-09-29 07:49:23,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:49:23,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:49:28,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:49:29,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.02 vs. limit=10.0 2023-09-29 07:49:29,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 07:49:29,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:29,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:49:32,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 07:49:32,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:32,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 07:49:32,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=297440.0, ans=0.09899494936611666 2023-09-29 07:49:32,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=297440.0, ans=0.125 2023-09-29 07:49:33,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 07:49:33,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 07:49:35,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:49:36,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:49:41,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:41,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:42,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:44,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 07:49:44,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:44,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:46,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 07:49:48,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 07:49:48,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 07:49:53,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:49:56,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:56,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 07:50:03,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:06,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:50:06,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:06,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:06,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:50:08,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:08,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:10,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:50:10,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:50:11,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:13,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 07:50:13,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.24 vs. limit=15.0 2023-09-29 07:50:14,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 07:50:14,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:16,062 INFO [train.py:1039] (1/4) Epoch 9, batch 2150, loss[loss=0.2095, simple_loss=0.2708, pruned_loss=0.07408, over 23442.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2789, pruned_loss=0.07166, over 4716937.78 frames. ], batch size: 285, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:50:19,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:50:19,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:50:19,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:50:19,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:50:25,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:50:25,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=297640.0, ans=0.05 2023-09-29 07:50:26,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:26,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:28,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:50:28,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:28,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:50:33,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:33,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:50:33,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:50:38,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:38,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 07:50:44,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:46,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:50:48,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:48,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:50:49,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:49,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:51,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:52,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 07:50:54,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:50:56,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:57,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:57,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:59,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:51:01,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:01,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:51:03,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:03,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 07:51:03,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:51:06,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:06,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:09,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:09,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:51:10,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=297840.0, ans=0.1 2023-09-29 07:51:11,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:12,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:12,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 07:51:15,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 07:51:15,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:51:16,533 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 07:51:17,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:17,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:51:19,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 07:51:19,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:51:19,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 07:51:19,490 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 07:51:19,490 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 07:51:20,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 07:51:21,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=297906.6666666667, ans=0.02 2023-09-29 07:51:22,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:22,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:51:22,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:51:24,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:25,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:51:27,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:27,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:28,409 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.066e+02 2.283e+02 2.527e+02 4.333e+02, threshold=4.566e+02, percent-clipped=0.0 2023-09-29 07:51:37,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:51:37,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 07:51:39,064 INFO [train.py:1039] (1/4) Epoch 9, batch 2200, loss[loss=0.2321, simple_loss=0.2919, pruned_loss=0.08615, over 23778.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2791, pruned_loss=0.0718, over 4719334.44 frames. ], batch size: 164, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:51:40,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:51:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:47,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:51:47,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:49,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:51:52,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:54,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:54,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 07:51:54,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=298040.0, ans=0.125 2023-09-29 07:51:54,612 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:51:56,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=298040.0, ans=0.125 2023-09-29 07:51:58,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 07:52:00,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:52:07,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 07:52:10,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:11,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:13,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:52:13,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=298106.6666666667, ans=0.0 2023-09-29 07:52:15,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:52:16,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 07:52:17,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=298106.6666666667, ans=0.125 2023-09-29 07:52:17,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=298106.6666666667, ans=0.2 2023-09-29 07:52:20,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:52:21,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:23,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:52:26,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:52:28,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:30,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:52:31,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:33,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 07:52:34,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:35,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=298173.3333333333, ans=0.0 2023-09-29 07:52:37,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 07:52:38,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:38,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:52:39,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:42,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:42,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:42,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:42,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:44,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:52:44,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:52:47,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:52:53,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:52:53,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:52:56,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:52:57,836 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 07:53:00,823 INFO [train.py:1039] (1/4) Epoch 9, batch 2250, loss[loss=0.2126, simple_loss=0.2898, pruned_loss=0.06774, over 24009.00 frames. ], tot_loss[loss=0.212, simple_loss=0.28, pruned_loss=0.07203, over 4710358.45 frames. ], batch size: 80, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:53:00,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:53:01,016 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 07:53:03,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:53:03,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 07:53:04,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:06,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:53:06,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:07,812 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 07:53:09,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:53:09,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-09-29 07:53:12,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:13,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=298306.6666666667, ans=0.125 2023-09-29 07:53:18,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:53:20,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:53:24,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:24,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:25,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:27,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 07:53:27,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:29,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:53:30,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 07:53:32,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:53:32,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:33,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:37,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:53:40,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:53:40,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:53:40,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 07:53:42,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:45,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:53:51,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:53,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:54,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:54,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:58,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:54:00,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:54:05,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:54:05,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:54:12,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:54:12,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:54:13,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:54:15,291 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.931e+02 2.128e+02 2.517e+02 4.313e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 07:54:18,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:54:20,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:54:20,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 07:54:21,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:21,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:54:22,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=298640.0, ans=0.0 2023-09-29 07:54:23,827 INFO [train.py:1039] (1/4) Epoch 9, batch 2300, loss[loss=0.2447, simple_loss=0.3009, pruned_loss=0.09419, over 22675.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2802, pruned_loss=0.07202, over 4712888.35 frames. ], batch size: 322, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:54:25,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 07:54:28,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:54:30,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:35,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:37,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:54:40,828 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 07:54:42,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:49,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:54:49,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:54:49,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:54:50,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:50,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 07:54:50,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:54:53,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:54:53,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:54:57,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:55:00,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:55:02,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:07,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:55:07,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:55:10,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:55:14,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:55:14,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=298840.0, ans=0.0 2023-09-29 07:55:17,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:55:19,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:55:19,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:55:19,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 07:55:22,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=298840.0, ans=0.125 2023-09-29 07:55:24,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:55:24,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:24,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:24,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:55:24,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=298840.0, ans=0.125 2023-09-29 07:55:25,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:27,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:55:27,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:55:27,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 07:55:27,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:55:27,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:28,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 07:55:34,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:55:40,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:55:44,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:44,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:55:44,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:55:44,817 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.91 vs. limit=15.0 2023-09-29 07:55:47,457 INFO [train.py:1039] (1/4) Epoch 9, batch 2350, loss[loss=0.2292, simple_loss=0.2973, pruned_loss=0.08058, over 23173.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2814, pruned_loss=0.07281, over 4695283.60 frames. ], batch size: 105, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:55:47,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:55:47,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:55:47,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:55:47,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 07:55:55,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:55:55,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 07:55:57,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=298973.3333333333, ans=0.125 2023-09-29 07:56:02,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 07:56:07,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:56:10,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:11,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:11,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 07:56:15,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:56:20,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 07:56:22,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:24,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=299106.6666666667, ans=0.0 2023-09-29 07:56:24,998 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.63 vs. limit=15.0 2023-09-29 07:56:25,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:56:25,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:56:28,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:56:30,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 07:56:30,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:56:31,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.14 vs. limit=15.0 2023-09-29 07:56:33,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:33,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:56:33,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:56:37,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:56:40,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 07:56:40,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:56:43,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:43,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:56:45,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 07:56:47,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:56:49,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 07:56:50,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:56:54,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 07:56:58,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 07:57:00,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:57:00,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:57:00,311 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 07:57:00,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 07:57:01,785 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.074e+02 2.290e+02 2.554e+02 3.364e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-29 07:57:04,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 07:57:05,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:57:08,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=299240.0, ans=0.1 2023-09-29 07:57:10,737 INFO [train.py:1039] (1/4) Epoch 9, batch 2400, loss[loss=0.2145, simple_loss=0.2864, pruned_loss=0.07134, over 23387.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.2808, pruned_loss=0.07274, over 4691014.37 frames. ], batch size: 105, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:57:10,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:57:13,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:57:16,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:57:17,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 07:57:17,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 07:57:19,412 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.75 vs. limit=6.0 2023-09-29 07:57:26,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:57:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:57:28,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 07:57:28,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:57:28,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=299373.3333333333, ans=0.1 2023-09-29 07:57:29,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:31,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 07:57:38,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:38,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 07:57:43,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:57:47,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 07:57:50,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:57:51,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:57,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:57:58,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 07:57:58,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:58:03,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:06,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:12,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:12,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=299506.6666666667, ans=0.0 2023-09-29 07:58:13,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:58:13,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:58:14,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:58:14,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:15,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:15,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:58:18,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:58:20,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:58:20,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 07:58:21,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 07:58:23,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:58:24,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:24,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 07:58:26,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 07:58:26,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 07:58:26,170 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 07:58:26,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 07:58:27,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:58:30,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:30,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:31,546 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 07:58:31,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:31,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:58:33,118 INFO [train.py:1039] (1/4) Epoch 9, batch 2450, loss[loss=0.2367, simple_loss=0.3118, pruned_loss=0.08079, over 23725.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2789, pruned_loss=0.07282, over 4686654.25 frames. ], batch size: 85, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:58:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:58:36,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:38,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=299640.0, ans=0.2 2023-09-29 07:58:42,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:42,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:44,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 07:58:50,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:50,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:52,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=299706.6666666667, ans=0.125 2023-09-29 07:58:54,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:58:54,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:58:54,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:58:54,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 07:58:58,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:59,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:59:01,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:59:04,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:59:06,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:06,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:07,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:59:10,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 07:59:12,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:59:12,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=299773.3333333333, ans=0.125 2023-09-29 07:59:20,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:21,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:59:21,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:21,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:59:21,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:23,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:59:24,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 07:59:26,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:27,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:59:28,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=299840.0, ans=0.125 2023-09-29 07:59:28,094 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:59:29,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=299840.0, ans=0.0 2023-09-29 07:59:29,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=299840.0, ans=0.125 2023-09-29 07:59:30,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:59:30,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:32,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=299840.0, ans=0.125 2023-09-29 07:59:36,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:59:36,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 07:59:37,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:59:39,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:59:39,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 07:59:40,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:59:42,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:59:46,462 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 2.171e+02 2.453e+02 2.838e+02 4.289e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-29 07:59:46,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:59:48,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:50,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:59:53,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 07:59:54,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:59:56,022 INFO [train.py:1039] (1/4) Epoch 9, batch 2500, loss[loss=0.2132, simple_loss=0.2744, pruned_loss=0.07605, over 23803.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2774, pruned_loss=0.07211, over 4688828.34 frames. ], batch size: 179, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 08:00:00,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:12,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:00:12,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:00:14,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:14,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 08:00:21,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:00:21,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:00:24,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:00:24,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:00:24,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 08:00:24,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:24,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=300040.0, ans=0.125 2023-09-29 08:00:25,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:25,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 08:00:25,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:28,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 08:00:28,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:28,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=300106.6666666667, ans=0.1 2023-09-29 08:00:33,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:00:33,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:36,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:00:38,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 08:00:38,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:00:40,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=300106.6666666667, ans=0.2 2023-09-29 08:00:41,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:44,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:49,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:52,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:00:57,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:00:59,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=300240.0, ans=0.0 2023-09-29 08:01:01,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 08:01:01,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:01,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:04,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:01:04,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:01:04,656 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 08:01:04,657 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 08:01:04,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 08:01:04,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=300240.0, ans=0.0 2023-09-29 08:01:08,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:09,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 08:01:09,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 08:01:11,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:01:11,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 08:01:14,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 08:01:16,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:18,173 INFO [train.py:1039] (1/4) Epoch 9, batch 2550, loss[loss=0.203, simple_loss=0.2699, pruned_loss=0.06805, over 24620.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2782, pruned_loss=0.07177, over 4700027.13 frames. ], batch size: 60, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:01:19,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:01:19,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:01:22,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:22,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 08:01:22,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:01:27,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 08:01:28,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:01:31,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:34,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:34,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:01:35,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:01:35,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:01:37,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:40,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:01:40,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 08:01:42,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:42,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:42,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 08:01:48,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=300440.0, ans=0.125 2023-09-29 08:01:56,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:02:02,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:04,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:04,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:02:04,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:02:11,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:02:13,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:02:14,072 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.59 vs. limit=22.5 2023-09-29 08:02:15,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:02:15,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:02:15,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:02:16,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:02:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:19,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:24,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:02:24,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=300573.3333333333, ans=0.125 2023-09-29 08:02:26,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 08:02:26,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:02:26,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:26,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:02:29,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:02:29,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:30,791 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.909e+02 2.105e+02 2.404e+02 4.394e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 08:02:35,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:02:36,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=300573.3333333333, ans=0.0 2023-09-29 08:02:36,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=300573.3333333333, ans=0.125 2023-09-29 08:02:37,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:39,470 INFO [train.py:1039] (1/4) Epoch 9, batch 2600, loss[loss=0.2196, simple_loss=0.2973, pruned_loss=0.07098, over 24436.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2798, pruned_loss=0.07245, over 4707317.26 frames. ], batch size: 77, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:02:39,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 08:02:42,732 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 08:02:42,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:02:42,823 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 08:02:42,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 08:02:44,267 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 08:02:46,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:46,566 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 08:02:48,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 08:02:50,067 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 08:02:50,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=300640.0, ans=0.2 2023-09-29 08:02:53,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:02:54,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 08:02:56,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 08:02:58,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:02:58,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 08:03:01,304 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 08:03:01,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 08:03:07,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:07,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:07,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:07,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 08:03:09,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:03:13,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=300773.3333333333, ans=0.125 2023-09-29 08:03:17,403 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 08:03:24,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:24,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:24,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=300773.3333333333, ans=0.1 2023-09-29 08:03:26,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 08:03:27,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:27,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:27,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 08:03:30,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:03:30,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:03:34,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:37,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=300840.0, ans=0.125 2023-09-29 08:03:39,113 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 08:03:39,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:40,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:03:40,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=300840.0, ans=0.0 2023-09-29 08:03:45,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:47,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:03:47,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 08:03:47,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:49,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:03:50,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:03:57,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 08:03:58,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:00,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:04:00,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=300973.3333333333, ans=0.1 2023-09-29 08:04:01,771 INFO [train.py:1039] (1/4) Epoch 9, batch 2650, loss[loss=0.1922, simple_loss=0.262, pruned_loss=0.06123, over 24669.00 frames. ], tot_loss[loss=0.2145, simple_loss=0.2818, pruned_loss=0.07363, over 4701116.01 frames. ], batch size: 65, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:04:03,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 08:04:03,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:05,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:04:05,258 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 08:04:05,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:08,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:11,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:04:13,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:04:16,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:04:16,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 08:04:16,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:04:17,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:04:19,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.11 vs. limit=15.0 2023-09-29 08:04:21,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 08:04:23,118 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 08:04:23,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=301040.0, ans=0.125 2023-09-29 08:04:23,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.72 vs. limit=15.0 2023-09-29 08:04:26,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:27,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 08:04:27,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:29,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 08:04:35,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:04:35,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:39,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 08:04:39,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 08:04:43,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:04:45,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=301106.6666666667, ans=0.02 2023-09-29 08:04:47,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 08:04:47,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:49,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:50,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:04:50,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:52,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:53,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:55,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:04:55,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:56,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:04:57,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:04:59,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:59,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:05:01,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:02,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:05:02,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:05:06,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:07,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:05:07,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:07,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 08:05:13,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:05:13,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:15,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.153e+02 2.553e+02 3.125e+02 4.988e+02, threshold=5.107e+02, percent-clipped=5.0 2023-09-29 08:05:17,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:17,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:19,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:05:20,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:22,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:22,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 08:05:23,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.33 vs. limit=15.0 2023-09-29 08:05:23,523 INFO [train.py:1039] (1/4) Epoch 9, batch 2700, loss[loss=0.1944, simple_loss=0.2638, pruned_loss=0.06246, over 20305.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.281, pruned_loss=0.07257, over 4713500.11 frames. ], batch size: 44, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:05:26,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:05:28,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:05:30,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:05:30,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:30,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:31,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:05:31,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:31,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:05:31,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:05:31,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 08:05:32,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=301306.6666666667, ans=0.125 2023-09-29 08:05:33,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:05:36,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:05:38,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:05:38,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:45,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:05:45,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 08:05:46,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:05:52,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:05:52,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:05:58,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:05:58,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:58,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:05:58,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:06:02,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:05,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:06,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:06:06,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:06:10,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:10,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:06:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:06:21,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:06:25,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:06:25,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:27,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:29,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:30,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:32,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:33,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:33,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:06:36,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:06:38,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:38,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:38,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=301573.3333333333, ans=0.2 2023-09-29 08:06:40,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 08:06:42,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:42,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=301573.3333333333, ans=0.125 2023-09-29 08:06:43,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:06:43,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 08:06:45,109 INFO [train.py:1039] (1/4) Epoch 9, batch 2750, loss[loss=0.23, simple_loss=0.2841, pruned_loss=0.08802, over 23853.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2812, pruned_loss=0.07255, over 4718108.05 frames. ], batch size: 195, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:06:45,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 08:06:46,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:50,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:06:50,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:53,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=301640.0, ans=0.125 2023-09-29 08:06:54,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:54,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:06:54,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:57,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:06:58,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:07:00,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:07:00,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:00,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 08:07:00,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:07:00,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:07:05,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 08:07:07,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=301706.6666666667, ans=0.02 2023-09-29 08:07:08,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:07:08,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:10,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:10,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:07:10,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:07:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:07:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:12,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=301706.6666666667, ans=0.125 2023-09-29 08:07:13,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:18,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:07:18,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:07:18,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:07:20,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:22,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:07:27,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=301773.3333333333, ans=0.0 2023-09-29 08:07:30,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:31,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:07:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:38,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:38,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:07:38,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:07:38,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=301840.0, ans=10.0 2023-09-29 08:07:43,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:07:43,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:43,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 08:07:45,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=301840.0, ans=0.2 2023-09-29 08:07:48,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:48,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=301840.0, ans=0.125 2023-09-29 08:07:50,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 08:07:56,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:07:59,502 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.994e+02 2.310e+02 2.732e+02 5.086e+02, threshold=4.620e+02, percent-clipped=0.0 2023-09-29 08:08:01,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:08:01,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 08:08:03,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:03,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:08:04,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 08:08:04,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:08:06,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:08:06,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:08,080 INFO [train.py:1039] (1/4) Epoch 9, batch 2800, loss[loss=0.2026, simple_loss=0.2817, pruned_loss=0.06174, over 24465.00 frames. ], tot_loss[loss=0.2118, simple_loss=0.2799, pruned_loss=0.07188, over 4710206.26 frames. ], batch size: 66, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:08:08,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:08:10,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 08:08:10,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:11,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:13,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:14,775 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 08:08:14,776 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 08:08:18,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:21,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:08:21,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:08:26,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:08:26,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 08:08:29,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:08:30,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 08:08:31,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:31,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:08:31,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:33,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=302040.0, ans=0.1 2023-09-29 08:08:36,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:08:36,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:37,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:08:38,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:08:41,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=302106.6666666667, ans=0.2 2023-09-29 08:08:48,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:08:49,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:50,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=302106.6666666667, ans=0.0 2023-09-29 08:08:51,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:52,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:54,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:59,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:08:59,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 08:08:59,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:01,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:01,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:09:04,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:05,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:07,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.38 vs. limit=22.5 2023-09-29 08:09:10,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:09:12,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:09:12,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:12,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:09:12,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:09:14,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:09:15,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:09:15,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 08:09:15,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:17,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:09:17,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:19,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 08:09:21,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:21,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:09:22,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:09:23,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 08:09:28,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.02 vs. limit=22.5 2023-09-29 08:09:30,402 INFO [train.py:1039] (1/4) Epoch 9, batch 2850, loss[loss=0.223, simple_loss=0.2827, pruned_loss=0.08165, over 23781.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2792, pruned_loss=0.07199, over 4702905.34 frames. ], batch size: 212, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:09:30,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:30,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:09:32,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:09:32,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=302306.6666666667, ans=0.125 2023-09-29 08:09:32,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=302306.6666666667, ans=0.0 2023-09-29 08:09:33,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:37,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:09:37,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:09:38,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:40,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:42,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:44,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:09:44,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 08:09:50,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 08:09:50,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:53,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 08:09:55,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:56,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 08:09:59,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 08:10:00,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:05,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=302440.0, ans=0.0 2023-09-29 08:10:08,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=302440.0, ans=0.0 2023-09-29 08:10:13,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:14,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:14,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:10:16,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:10:16,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:10:16,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:10:18,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:10:18,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 08:10:19,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:10:21,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:21,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:21,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:24,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:24,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:26,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:28,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:29,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:10:31,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:34,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:36,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:10:41,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:10:43,750 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.988e+02 2.184e+02 2.463e+02 3.940e+02, threshold=4.369e+02, percent-clipped=0.0 2023-09-29 08:10:43,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 08:10:43,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 08:10:45,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:10:47,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:47,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 08:10:49,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:10:49,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:49,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:50,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:10:50,633 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 08:10:50,727 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 08:10:50,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:10:51,987 INFO [train.py:1039] (1/4) Epoch 9, batch 2900, loss[loss=0.2236, simple_loss=0.2853, pruned_loss=0.081, over 23914.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.279, pruned_loss=0.07168, over 4702220.36 frames. ], batch size: 195, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:10:52,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:56,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:10:56,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:59,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:59,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 08:11:04,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:04,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 08:11:06,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 08:11:06,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:11:06,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:11:08,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:10,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:11:14,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:11:14,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:17,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:11:18,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 08:11:20,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:11:20,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:20,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=302706.6666666667, ans=0.125 2023-09-29 08:11:23,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 08:11:25,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 08:11:28,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:11:28,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 08:11:28,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:11:29,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=302773.3333333333, ans=0.125 2023-09-29 08:11:31,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:11:31,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:11:33,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:35,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:39,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:11:39,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=302773.3333333333, ans=0.125 2023-09-29 08:11:41,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:11:42,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 08:11:44,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 08:11:44,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:11:44,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=302840.0, ans=0.025 2023-09-29 08:11:48,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:11:51,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 08:11:53,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:11:57,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:12:08,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:12:08,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:12:09,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 08:12:11,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=302906.6666666667, ans=0.125 2023-09-29 08:12:12,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.05 vs. limit=10.0 2023-09-29 08:12:14,607 INFO [train.py:1039] (1/4) Epoch 9, batch 2950, loss[loss=0.2125, simple_loss=0.2967, pruned_loss=0.06413, over 24486.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2807, pruned_loss=0.07231, over 4711053.42 frames. ], batch size: 69, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:12:14,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:14,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 08:12:14,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:16,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:12:21,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:23,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 08:12:24,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:24,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:26,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:12:26,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=302973.3333333333, ans=0.0 2023-09-29 08:12:27,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:12:29,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 08:12:29,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=303040.0, ans=0.125 2023-09-29 08:12:30,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 08:12:30,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:12:30,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:37,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:12:39,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:12:41,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:12:41,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:12:46,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:12:46,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:12:48,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:12:52,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 08:12:57,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 08:12:57,997 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 08:12:59,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:13:00,865 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 08:13:02,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 08:13:02,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:03,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:13:03,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 08:13:03,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:13:08,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 08:13:08,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:13:10,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:13:14,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:15,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:13:16,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:16,458 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 08:13:16,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:16,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 08:13:23,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:23,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:13:24,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=303240.0, ans=0.0 2023-09-29 08:13:25,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 08:13:25,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:13:26,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 08:13:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:29,728 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.974e+02 2.174e+02 2.569e+02 4.331e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-29 08:13:30,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:13:31,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:13:33,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:33,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:13:35,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:13:36,361 INFO [train.py:1039] (1/4) Epoch 9, batch 3000, loss[loss=0.2217, simple_loss=0.287, pruned_loss=0.07818, over 24056.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.28, pruned_loss=0.0713, over 4728562.26 frames. ], batch size: 80, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:13:36,362 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 08:13:49,678 INFO [train.py:1071] (1/4) Epoch 9, validation: loss=0.2838, simple_loss=0.2753, pruned_loss=0.1462, over 1125622.00 frames. 2023-09-29 08:13:49,679 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 08:13:49,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:49,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:13:49,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:13:49,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:52,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:13:52,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=303306.6666666667, ans=0.1 2023-09-29 08:13:53,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:53,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 08:13:55,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:58,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:59,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:14:01,543 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 08:14:01,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 08:14:05,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:14:06,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:14:06,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 08:14:08,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:14,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:14:25,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:14:30,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 08:14:31,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:14:33,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:14:33,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:33,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:14:36,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:37,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 08:14:40,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 08:14:40,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:14:41,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:14:43,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:14:43,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:44,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:14:44,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:14:49,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:14:49,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:49,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:14:52,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:55,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 08:14:57,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:14:58,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:14:58,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:15:03,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:03,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:04,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:15:04,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 08:15:06,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:06,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 08:15:06,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:15:07,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 08:15:08,100 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:15:10,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:10,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:15:10,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 08:15:12,232 INFO [train.py:1039] (1/4) Epoch 9, batch 3050, loss[loss=0.223, simple_loss=0.2827, pruned_loss=0.08161, over 23489.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2807, pruned_loss=0.07119, over 4749384.48 frames. ], batch size: 134, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:15:12,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 08:15:12,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:15:13,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:15:15,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:15,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:15:15,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:16,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:15:19,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 08:15:20,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:15:24,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:24,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:15:26,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=303640.0, ans=0.09899494936611666 2023-09-29 08:15:29,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:31,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 08:15:38,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 08:15:39,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 08:15:39,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:42,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:15:42,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=303706.6666666667, ans=0.0 2023-09-29 08:15:45,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:45,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:45,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:45,923 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:15:49,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:15:50,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:50,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:50,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:50,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:52,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:56,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:59,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:00,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 08:16:00,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:16:00,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:16:04,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:16:06,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:16:06,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:07,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:12,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:16:13,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:18,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:19,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:16:19,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:21,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:21,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:16:21,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:16:23,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 08:16:25,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:26,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:26,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 08:16:28,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 1.995e+02 2.261e+02 2.647e+02 3.760e+02, threshold=4.522e+02, percent-clipped=0.0 2023-09-29 08:16:28,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:32,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=303906.6666666667, ans=0.1 2023-09-29 08:16:35,199 INFO [train.py:1039] (1/4) Epoch 9, batch 3100, loss[loss=0.2102, simple_loss=0.2867, pruned_loss=0.06683, over 24005.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2807, pruned_loss=0.07104, over 4730943.81 frames. ], batch size: 86, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:16:35,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:36,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:16:40,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:16:42,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 08:16:44,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 08:16:45,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 08:16:47,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:16:47,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=303973.3333333333, ans=0.125 2023-09-29 08:16:50,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:50,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:52,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=304040.0, ans=0.125 2023-09-29 08:16:53,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:16:58,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:03,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 08:17:08,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:17:10,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:10,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:10,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:17:12,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:17:14,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:17:14,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 08:17:14,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:17:15,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:17,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 08:17:18,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:17:20,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=304106.6666666667, ans=0.125 2023-09-29 08:17:23,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:17:24,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 08:17:24,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 08:17:26,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:26,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:29,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:29,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:29,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:17:31,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:17:31,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:17:33,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:17:33,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:17:33,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:33,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:17:38,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:39,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=15.0 2023-09-29 08:17:39,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 08:17:40,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=304240.0, ans=0.125 2023-09-29 08:17:42,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:17:44,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 08:17:45,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:45,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:47,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 08:17:55,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 08:17:57,826 INFO [train.py:1039] (1/4) Epoch 9, batch 3150, loss[loss=0.2096, simple_loss=0.2937, pruned_loss=0.06276, over 24576.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2794, pruned_loss=0.07073, over 4724059.96 frames. ], batch size: 71, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:17:58,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:17:59,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:01,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:18:01,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:18:02,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 08:18:04,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:04,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:18:04,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=304306.6666666667, ans=0.1 2023-09-29 08:18:07,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 08:18:09,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:11,201 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 08:18:11,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=304306.6666666667, ans=0.2 2023-09-29 08:18:13,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=304373.3333333333, ans=0.125 2023-09-29 08:18:14,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 08:18:14,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:18:14,467 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 08:18:15,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:18:18,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 08:18:18,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-09-29 08:18:19,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 08:18:19,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 08:18:19,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:19,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:20,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:23,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 08:18:26,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:26,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:27,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:29,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:18:29,849 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.31 vs. limit=22.5 2023-09-29 08:18:32,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 08:18:33,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:18:36,112 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.11 vs. limit=6.0 2023-09-29 08:18:36,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:18:38,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:38,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 08:18:40,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 08:18:41,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:18:42,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:18:42,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:18:42,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:42,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:18:45,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:18:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:18:45,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 08:18:45,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=304506.6666666667, ans=0.1 2023-09-29 08:18:46,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=304506.6666666667, ans=0.125 2023-09-29 08:18:47,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:18:47,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:48,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:18:49,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:51,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 08:18:51,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:18:53,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 08:18:53,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:55,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 08:18:57,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 08:18:57,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:18:58,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:00,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 08:19:01,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 08:19:01,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:19:03,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:19:05,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:06,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:19:10,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=304573.3333333333, ans=0.1 2023-09-29 08:19:13,058 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 2.121e+02 2.392e+02 3.280e+02 6.565e+02, threshold=4.784e+02, percent-clipped=9.0 2023-09-29 08:19:13,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:19:13,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:16,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 08:19:19,814 INFO [train.py:1039] (1/4) Epoch 9, batch 3200, loss[loss=0.2081, simple_loss=0.2808, pruned_loss=0.06772, over 24475.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2773, pruned_loss=0.07027, over 4722893.30 frames. ], batch size: 63, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:19:21,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:19:21,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:19:26,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:28,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:19:28,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 08:19:30,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:35,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=304706.6666666667, ans=0.125 2023-09-29 08:19:35,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=304706.6666666667, ans=0.04949747468305833 2023-09-29 08:19:36,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:19:38,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:47,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=304706.6666666667, ans=0.1 2023-09-29 08:19:48,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:19:58,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 08:19:58,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:20:01,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 08:20:02,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:20:07,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:20:07,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:20:08,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:20:13,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 08:20:13,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:20:16,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 08:20:20,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 08:20:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:20:28,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:28,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:20:29,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:29,751 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 08:20:29,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:20:31,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=304906.6666666667, ans=0.0 2023-09-29 08:20:32,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:20:35,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=304906.6666666667, ans=0.1 2023-09-29 08:20:36,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 08:20:36,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 08:20:38,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 08:20:39,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 08:20:41,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:20:43,054 INFO [train.py:1039] (1/4) Epoch 9, batch 3250, loss[loss=0.2047, simple_loss=0.2678, pruned_loss=0.07077, over 23699.00 frames. ], tot_loss[loss=0.2086, simple_loss=0.2773, pruned_loss=0.06995, over 4726231.52 frames. ], batch size: 149, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:20:43,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:20:44,821 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 08:20:44,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:20:44,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:20:45,052 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 08:20:49,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:20:53,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:21:02,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:02,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 08:21:02,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:04,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:21:04,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:05,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:05,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:21:09,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:09,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:21:11,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:11,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:15,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:15,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=305106.6666666667, ans=0.0 2023-09-29 08:21:17,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:18,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:18,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:21,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:21,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:21,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:26,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 08:21:26,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:21:26,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:21:27,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.46 vs. limit=10.0 2023-09-29 08:21:28,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:28,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:21:34,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:21:41,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=305173.3333333333, ans=0.2 2023-09-29 08:21:46,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:21:46,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:46,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 08:21:46,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:21:46,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:21:47,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:48,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=305240.0, ans=0.125 2023-09-29 08:21:49,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 08:21:49,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 08:21:50,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:51,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:52,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:52,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:21:54,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:55,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=305240.0, ans=0.1 2023-09-29 08:21:57,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:57,892 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.02 vs. limit=22.5 2023-09-29 08:21:58,893 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.015e+02 2.327e+02 2.716e+02 4.299e+02, threshold=4.655e+02, percent-clipped=0.0 2023-09-29 08:21:59,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:01,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 08:22:01,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:02,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:22:02,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 08:22:06,376 INFO [train.py:1039] (1/4) Epoch 9, batch 3300, loss[loss=0.1873, simple_loss=0.2768, pruned_loss=0.04887, over 24329.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2781, pruned_loss=0.07065, over 4722957.11 frames. ], batch size: 74, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:22:06,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:22:06,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 08:22:09,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 08:22:11,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 08:22:11,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:11,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=305306.6666666667, ans=0.125 2023-09-29 08:22:17,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:18,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:22:18,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:19,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:22:21,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:22:21,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:22,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:22:27,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 08:22:29,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:22:29,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:30,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:32,102 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 08:22:33,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:22:33,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:22:35,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:22:35,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:22:35,926 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 08:22:39,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:39,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:22:41,560 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.74 vs. limit=15.0 2023-09-29 08:22:42,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:42,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 08:22:44,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 08:22:44,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:44,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:22:47,416 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 08:22:48,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-29 08:22:48,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 08:22:48,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:22:49,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=305440.0, ans=0.125 2023-09-29 08:22:52,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 08:22:54,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:22:57,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:22:58,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:00,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:01,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:01,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:23:01,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:23:03,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:23:03,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:05,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:23:07,117 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 08:23:09,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 08:23:10,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:23:10,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:10,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:13,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:13,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:15,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:23:17,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:17,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:23:18,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:19,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.82 vs. limit=15.0 2023-09-29 08:23:20,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:23:23,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 08:23:23,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:23,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:23:25,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:23:28,429 INFO [train.py:1039] (1/4) Epoch 9, batch 3350, loss[loss=0.2071, simple_loss=0.2871, pruned_loss=0.06353, over 24656.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2791, pruned_loss=0.07111, over 4727887.67 frames. ], batch size: 73, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:23:28,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:30,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:30,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:33,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:34,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:36,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:23:37,642 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.17 vs. limit=15.0 2023-09-29 08:23:38,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:38,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:23:40,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:42,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:23:44,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 08:23:45,656 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 08:23:45,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:49,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 08:23:49,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 08:23:49,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:23:49,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:23:52,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:52,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 08:23:53,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:53,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:23:56,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:59,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:59,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:24:00,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:24:02,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=305773.3333333333, ans=0.5 2023-09-29 08:24:05,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:06,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:07,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:11,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:24:13,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:24:16,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:16,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:17,230 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.57 vs. limit=15.0 2023-09-29 08:24:20,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:21,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 08:24:23,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:24:23,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 08:24:23,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:24:24,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 08:24:25,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:27,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:27,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=305840.0, ans=0.0 2023-09-29 08:24:33,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:34,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 08:24:34,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:24:37,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:24:37,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:24:43,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:24:43,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=305906.6666666667, ans=0.125 2023-09-29 08:24:44,736 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 2.029e+02 2.244e+02 2.615e+02 3.935e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 08:24:44,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 08:24:45,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=305906.6666666667, ans=0.0 2023-09-29 08:24:46,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:24:46,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:24:48,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=305906.6666666667, ans=0.125 2023-09-29 08:24:49,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:50,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 08:24:50,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:51,467 INFO [train.py:1039] (1/4) Epoch 9, batch 3400, loss[loss=0.2127, simple_loss=0.2917, pruned_loss=0.06683, over 24534.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2805, pruned_loss=0.07215, over 4723118.51 frames. ], batch size: 71, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:24:51,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 08:24:52,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=305973.3333333333, ans=10.0 2023-09-29 08:24:53,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:24:55,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:24:56,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 08:25:00,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 08:25:00,152 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 08:25:00,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:05,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:25:05,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:25:06,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:08,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:25:10,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-09-29 08:25:13,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:16,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 08:25:18,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=306040.0, ans=0.2 2023-09-29 08:25:20,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=306040.0, ans=0.2 2023-09-29 08:25:22,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:25:24,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:25,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:26,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:25:26,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=306106.6666666667, ans=0.125 2023-09-29 08:25:34,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.87 vs. limit=22.5 2023-09-29 08:25:34,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:25:38,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 08:25:42,533 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.91 vs. limit=12.0 2023-09-29 08:25:46,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 08:25:46,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:25:47,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:48,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:48,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:25:51,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:54,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:25:54,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:25:55,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=306173.3333333333, ans=0.2 2023-09-29 08:26:01,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:03,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 08:26:09,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:26:13,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 08:26:15,233 INFO [train.py:1039] (1/4) Epoch 9, batch 3450, loss[loss=0.2136, simple_loss=0.2851, pruned_loss=0.07111, over 23415.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2803, pruned_loss=0.07206, over 4721862.69 frames. ], batch size: 93, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:26:16,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 08:26:18,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:26:19,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:26:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 08:26:20,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:24,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:26:32,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:26:33,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:33,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:26:33,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:37,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=306373.3333333333, ans=0.0 2023-09-29 08:26:38,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:43,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=306373.3333333333, ans=0.125 2023-09-29 08:26:45,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 08:26:45,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=306373.3333333333, ans=0.125 2023-09-29 08:26:50,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 08:26:50,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:26:50,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:26:52,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:57,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 08:26:58,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:27:02,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:02,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:27:03,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:27:05,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:27:07,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 08:27:07,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:07,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:27:08,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=306506.6666666667, ans=0.0 2023-09-29 08:27:10,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:27:13,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 08:27:18,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:27:22,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:27:22,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:27,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:32,306 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.104e+02 2.323e+02 2.853e+02 3.879e+02, threshold=4.645e+02, percent-clipped=0.0 2023-09-29 08:27:32,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:32,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:32,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:27:34,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:38,560 INFO [train.py:1039] (1/4) Epoch 9, batch 3500, loss[loss=0.2149, simple_loss=0.276, pruned_loss=0.07696, over 23547.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2784, pruned_loss=0.07192, over 4706459.74 frames. ], batch size: 106, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:27:38,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:43,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:27:44,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 08:27:44,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=306640.0, ans=0.0 2023-09-29 08:27:47,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:27:52,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:27:53,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:53,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 08:27:58,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:27:58,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=306706.6666666667, ans=0.125 2023-09-29 08:28:00,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:28:02,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:28:02,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:03,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:28:03,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:03,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:05,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 08:28:08,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:09,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:28:11,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:14,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:16,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 08:28:16,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:20,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:21,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:28:21,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:23,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:28:25,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:25,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 08:28:28,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 08:28:28,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 08:28:28,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:29,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:30,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:30,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:28:35,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:28:35,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:28:40,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:28:41,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 08:28:41,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 08:28:41,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:28:45,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:28:45,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:47,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 08:28:50,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:54,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 08:28:56,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 08:29:01,016 INFO [train.py:1039] (1/4) Epoch 9, batch 3550, loss[loss=0.1989, simple_loss=0.2812, pruned_loss=0.05831, over 24656.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2775, pruned_loss=0.07096, over 4700253.25 frames. ], batch size: 65, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:29:01,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:02,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:29:02,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:04,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:05,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:29:13,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:15,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:29:16,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=307040.0, ans=0.0 2023-09-29 08:29:17,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:19,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:29:20,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:22,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:29:22,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:29:25,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:26,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:29:26,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:27,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:29:27,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:29:34,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:29:34,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:35,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:35,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:35,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:29:36,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 08:29:37,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:29:47,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:47,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:47,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:50,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 08:29:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:29:51,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 08:29:53,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:56,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:29:56,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:29:58,412 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.140e-03 2023-09-29 08:30:01,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 08:30:01,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:08,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:10,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 08:30:10,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:15,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:30:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 08:30:17,034 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.963e+02 2.242e+02 2.644e+02 4.260e+02, threshold=4.484e+02, percent-clipped=0.0 2023-09-29 08:30:21,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 08:30:22,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:30:22,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:30:23,418 INFO [train.py:1039] (1/4) Epoch 9, batch 3600, loss[loss=0.222, simple_loss=0.2794, pruned_loss=0.08228, over 23759.00 frames. ], tot_loss[loss=0.2099, simple_loss=0.278, pruned_loss=0.07091, over 4705141.57 frames. ], batch size: 232, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:30:24,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:25,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:26,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:30:29,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:31,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:32,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:30:34,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:30:36,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:36,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 08:30:38,508 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.78 vs. limit=15.0 2023-09-29 08:30:40,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:30:40,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:43,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:46,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:49,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=307373.3333333333, ans=0.125 2023-09-29 08:30:50,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:30:50,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:50,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 08:30:51,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:53,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=307373.3333333333, ans=0.1 2023-09-29 08:30:54,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:56,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:30:57,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:59,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:59,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:00,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 08:31:09,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:31:10,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:31:12,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 08:31:12,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=307506.6666666667, ans=0.125 2023-09-29 08:31:12,889 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.36 vs. limit=15.0 2023-09-29 08:31:16,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:31:22,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:25,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:31,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:31:31,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:31:31,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 08:31:33,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 08:31:35,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 08:31:36,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:37,687 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.10 vs. limit=15.0 2023-09-29 08:31:38,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:31:38,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 08:31:39,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:31:39,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:31:40,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:31:40,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.31 vs. limit=15.0 2023-09-29 08:31:41,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 08:31:41,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 08:31:45,442 INFO [train.py:1039] (1/4) Epoch 9, batch 3650, loss[loss=0.2837, simple_loss=0.3224, pruned_loss=0.1225, over 19589.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2784, pruned_loss=0.07126, over 4716947.41 frames. ], batch size: 388, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:31:45,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:47,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 08:31:51,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 08:31:51,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=307640.0, ans=0.0 2023-09-29 08:31:52,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:31:56,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 08:31:57,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 08:32:00,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:01,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=307706.6666666667, ans=0.0 2023-09-29 08:32:02,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:32:02,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:32:06,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:32:08,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:32:08,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=307706.6666666667, ans=0.2 2023-09-29 08:32:09,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 08:32:09,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:32:09,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:11,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 08:32:11,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:32:11,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:32:11,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:14,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:32:18,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 08:32:18,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 08:32:19,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:32:21,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 08:32:25,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:25,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:32:30,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:32:33,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:33,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:32:35,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:32:35,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:32:36,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:32:41,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:43,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:43,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:43,517 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:32:44,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:32:46,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:46,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:32:49,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=307906.6666666667, ans=0.125 2023-09-29 08:32:50,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=307906.6666666667, ans=0.1 2023-09-29 08:32:54,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 08:33:00,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:00,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:01,736 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.073e+02 2.361e+02 2.805e+02 4.754e+02, threshold=4.723e+02, percent-clipped=2.0 2023-09-29 08:33:01,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:33:01,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:03,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:33:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:07,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 08:33:07,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:08,644 INFO [train.py:1039] (1/4) Epoch 9, batch 3700, loss[loss=0.2165, simple_loss=0.2945, pruned_loss=0.0692, over 24397.00 frames. ], tot_loss[loss=0.211, simple_loss=0.2794, pruned_loss=0.0713, over 4728232.81 frames. ], batch size: 77, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:33:10,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:33:11,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:33:11,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:33:12,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:12,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 08:33:12,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:13,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:33:13,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:33:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:33:19,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:20,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:21,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:33:21,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:22,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:33:25,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:26,589 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 08:33:30,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=308040.0, ans=0.125 2023-09-29 08:33:35,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:33:35,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:33:38,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:33:38,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 08:33:40,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:43,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:44,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 08:33:46,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:47,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:33:50,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:50,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:33:52,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:33:55,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:55,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 08:33:57,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:57,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 08:34:03,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:34:04,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:34:07,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:09,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 08:34:12,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:34:12,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:34:13,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:13,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:16,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:18,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 08:34:19,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 08:34:19,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:34:19,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:22,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:34:24,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:34:25,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:34:27,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:34:28,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:34:30,337 INFO [train.py:1039] (1/4) Epoch 9, batch 3750, loss[loss=0.2095, simple_loss=0.2944, pruned_loss=0.06229, over 24581.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2799, pruned_loss=0.0717, over 4730293.75 frames. ], batch size: 71, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:34:30,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 08:34:32,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:34:34,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:34:35,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 08:34:35,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:34:36,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=308306.6666666667, ans=0.125 2023-09-29 08:34:38,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:38,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:39,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:34:42,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:34:48,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:34:49,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:34:49,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:49,962 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:34:54,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:34:54,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 08:34:57,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:34:59,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:34:59,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:35:00,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=308373.3333333333, ans=0.1 2023-09-29 08:35:02,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 08:35:02,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=308440.0, ans=0.125 2023-09-29 08:35:05,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 08:35:07,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:35:08,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:35:11,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:35:11,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=308440.0, ans=0.0 2023-09-29 08:35:13,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.77 vs. limit=6.0 2023-09-29 08:35:17,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:18,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:35:23,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 08:35:27,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:28,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=308506.6666666667, ans=0.125 2023-09-29 08:35:29,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=308506.6666666667, ans=0.0 2023-09-29 08:35:29,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.85 vs. limit=22.5 2023-09-29 08:35:30,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:35:30,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:35:32,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=308506.6666666667, ans=0.125 2023-09-29 08:35:35,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:35:39,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:35:41,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:35:43,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:35:45,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:35:47,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:35:48,986 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.167e+02 2.525e+02 3.168e+02 5.587e+02, threshold=5.051e+02, percent-clipped=3.0 2023-09-29 08:35:53,552 INFO [train.py:1039] (1/4) Epoch 9, batch 3800, loss[loss=0.2309, simple_loss=0.2829, pruned_loss=0.08947, over 23657.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2796, pruned_loss=0.07098, over 4728794.88 frames. ], batch size: 232, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:35:57,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:36:01,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:02,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:36:02,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 08:36:04,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:07,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:08,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:36:09,511 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.66 vs. limit=22.5 2023-09-29 08:36:10,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 08:36:10,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:11,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:36:13,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:14,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:36:14,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:16,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 08:36:19,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 08:36:21,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:36:24,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:27,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:36:27,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:36:30,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:36:30,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:34,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:38,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.46 vs. limit=15.0 2023-09-29 08:36:39,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:36:39,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 08:36:40,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.89 vs. limit=15.0 2023-09-29 08:36:41,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:36:47,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:36:53,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:36:55,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 08:36:57,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 08:36:59,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:00,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:37:02,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:03,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 08:37:07,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 08:37:07,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 08:37:09,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:09,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:37:09,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.73 vs. limit=6.0 2023-09-29 08:37:15,126 INFO [train.py:1039] (1/4) Epoch 9, batch 3850, loss[loss=0.2105, simple_loss=0.2616, pruned_loss=0.07971, over 23390.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2794, pruned_loss=0.07157, over 4738010.19 frames. ], batch size: 285, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:37:15,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:37:16,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:37:21,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:37:21,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 08:37:25,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:37:25,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:28,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:37:32,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:33,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:37:33,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 08:37:40,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:43,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:47,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:37:47,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:37:51,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:51,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:37:51,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:53,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:37:53,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:56,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:37:58,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 08:37:58,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 08:37:59,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:01,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:05,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:06,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:06,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 08:38:08,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 08:38:09,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:10,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=309173.3333333333, ans=0.125 2023-09-29 08:38:11,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 08:38:15,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:38:19,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:20,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:25,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:25,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 08:38:27,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=309240.0, ans=0.125 2023-09-29 08:38:28,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 08:38:30,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:31,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:33,497 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 1.974e+02 2.231e+02 2.579e+02 4.458e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 08:38:33,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:38:33,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:38:35,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:38:35,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 08:38:38,202 INFO [train.py:1039] (1/4) Epoch 9, batch 3900, loss[loss=0.2038, simple_loss=0.2716, pruned_loss=0.06795, over 23406.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2784, pruned_loss=0.07133, over 4738381.28 frames. ], batch size: 93, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:38:38,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:39,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 08:38:39,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:39,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:40,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:38:40,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:42,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:38:43,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:43,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:44,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:38:44,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 08:38:45,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:48,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:49,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:51,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:38:51,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:55,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=309373.3333333333, ans=0.0 2023-09-29 08:38:56,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:56,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:57,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:38:59,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 08:38:59,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:01,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 08:39:02,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:39:02,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 08:39:05,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 08:39:09,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:09,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten.whitening_limit, batch_count=309373.3333333333, ans=22.5 2023-09-29 08:39:10,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:39:10,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:39:12,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:17,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:19,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:39:19,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.82 vs. limit=15.0 2023-09-29 08:39:22,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:39:22,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:24,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:39:30,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:30,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:39:32,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=309506.6666666667, ans=0.0 2023-09-29 08:39:36,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:39:38,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:39:42,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.88 vs. limit=15.0 2023-09-29 08:39:50,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:39:52,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:52,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 08:39:52,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 08:39:52,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:55,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 08:39:57,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:59,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 08:39:59,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=309573.3333333333, ans=0.2 2023-09-29 08:40:02,348 INFO [train.py:1039] (1/4) Epoch 9, batch 3950, loss[loss=0.2256, simple_loss=0.2872, pruned_loss=0.08197, over 23565.00 frames. ], tot_loss[loss=0.21, simple_loss=0.2784, pruned_loss=0.07084, over 4740004.99 frames. ], batch size: 256, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:40:05,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:40:07,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 08:40:07,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:40:10,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:40:10,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:40:12,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.24 vs. limit=15.0 2023-09-29 08:40:15,784 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 08:40:17,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:17,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 08:40:18,785 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 08:40:18,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:40:22,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:22,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=309706.6666666667, ans=0.2 2023-09-29 08:40:23,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:40:23,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:26,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 08:40:30,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:40:31,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:31,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:40:32,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:40:33,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:40:46,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:40:46,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:40:49,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=309840.0, ans=0.125 2023-09-29 08:40:52,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 08:40:57,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=309840.0, ans=0.0 2023-09-29 08:40:57,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.86 vs. limit=15.0 2023-09-29 08:40:59,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 08:40:59,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 08:40:59,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:01,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:41:01,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=309840.0, ans=0.125 2023-09-29 08:41:11,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:41:11,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:41:11,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:11,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:41:13,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 08:41:16,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:41:18,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:41:19,320 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.102e+02 2.264e+02 2.656e+02 4.963e+02, threshold=4.527e+02, percent-clipped=1.0 2023-09-29 08:41:19,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=309906.6666666667, ans=0.125 2023-09-29 08:41:22,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 08:41:23,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-09-29 08:41:23,845 INFO [train.py:1039] (1/4) Epoch 9, batch 4000, loss[loss=0.2116, simple_loss=0.2748, pruned_loss=0.07423, over 23361.00 frames. ], tot_loss[loss=0.2107, simple_loss=0.2791, pruned_loss=0.07113, over 4746897.73 frames. ], batch size: 285, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:41:32,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:41,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:44,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=310040.0, ans=0.1 2023-09-29 08:41:45,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:47,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:41:47,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:47,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 08:41:49,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:41:49,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 08:41:49,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:41:49,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 08:41:52,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:55,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:41:55,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:41:55,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:57,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:57,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:41:59,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:41:59,373 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 08:41:59,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:42:00,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:04,690 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 08:42:04,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:42:04,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:05,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=310106.6666666667, ans=0.025 2023-09-29 08:42:13,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 08:42:13,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:42:13,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=310173.3333333333, ans=0.2 2023-09-29 08:42:15,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:42:16,780 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 08:42:18,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:42:19,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 08:42:19,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:42:19,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:21,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:42:23,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:42:24,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:42:24,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:25,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 08:42:26,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:28,102 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 08:42:30,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=310240.0, ans=0.125 2023-09-29 08:42:32,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:42:37,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:42:39,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:42:39,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.74 vs. limit=22.5 2023-09-29 08:42:40,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:40,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:42:42,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:42:47,927 INFO [train.py:1039] (1/4) Epoch 9, batch 4050, loss[loss=0.2431, simple_loss=0.298, pruned_loss=0.09406, over 22758.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2804, pruned_loss=0.0719, over 4730442.97 frames. ], batch size: 322, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:42:48,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:49,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:42:51,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 08:42:52,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:42:52,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:42:54,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:42:54,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:42:56,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:42:59,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=310306.6666666667, ans=0.125 2023-09-29 08:43:01,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:43:04,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:04,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=310373.3333333333, ans=0.0 2023-09-29 08:43:05,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:43:07,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:43:07,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=310373.3333333333, ans=0.1 2023-09-29 08:43:08,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:43:11,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:16,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:43:17,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 08:43:19,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 08:43:21,194 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 08:43:22,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:43:29,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 08:43:30,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:43:34,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:35,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=310506.6666666667, ans=0.125 2023-09-29 08:43:37,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:38,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:43:40,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:41,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:46,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 08:43:46,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:43:47,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:43:50,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 08:43:56,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:44:01,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.85 vs. limit=15.0 2023-09-29 08:44:02,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 08:44:03,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=310573.3333333333, ans=0.125 2023-09-29 08:44:04,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:04,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:44:05,848 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.021e+02 2.194e+02 2.667e+02 4.003e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-29 08:44:06,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 08:44:06,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 08:44:06,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:09,639 INFO [train.py:1039] (1/4) Epoch 9, batch 4100, loss[loss=0.2385, simple_loss=0.2989, pruned_loss=0.08902, over 23870.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.282, pruned_loss=0.07246, over 4731470.13 frames. ], batch size: 195, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:44:09,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:09,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:09,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:44:17,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 08:44:19,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 08:44:22,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 08:44:23,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 08:44:23,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:25,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:44:26,744 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 08:44:30,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:31,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:44:31,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:33,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:44:35,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:44:36,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:36,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:44:36,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 08:44:38,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:38,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:44:38,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:38,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:44:38,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 08:44:41,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:44:45,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 08:44:45,709 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.71 vs. limit=15.0 2023-09-29 08:44:46,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:49,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:49,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 08:44:51,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=310773.3333333333, ans=0.1 2023-09-29 08:44:52,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=310773.3333333333, ans=0.0 2023-09-29 08:44:53,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:53,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:44:53,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:44:56,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 08:44:57,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:44:57,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:45:00,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 08:45:01,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:01,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:04,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:09,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:14,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:14,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:45:19,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=310906.6666666667, ans=0.0 2023-09-29 08:45:22,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:22,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:28,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:31,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:45:33,037 INFO [train.py:1039] (1/4) Epoch 9, batch 4150, loss[loss=0.2102, simple_loss=0.2898, pruned_loss=0.06527, over 24641.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.283, pruned_loss=0.07324, over 4711161.17 frames. ], batch size: 73, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:45:33,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:34,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:45:34,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:45:34,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:35,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=310973.3333333333, ans=0.125 2023-09-29 08:45:38,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 08:45:38,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:40,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 08:45:40,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 08:45:41,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 08:45:41,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:47,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:45:47,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:48,635 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=12.0 2023-09-29 08:45:52,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:53,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:45:54,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:45:57,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:45:57,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:58,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:46:04,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:08,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:10,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 08:46:10,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=311106.6666666667, ans=0.125 2023-09-29 08:46:12,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 08:46:12,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:46:13,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 08:46:13,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:46:13,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:17,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:17,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:19,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=311106.6666666667, ans=0.125 2023-09-29 08:46:21,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 08:46:26,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:26,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:46:27,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 08:46:27,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:28,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=311173.3333333333, ans=0.1 2023-09-29 08:46:30,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 08:46:31,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:46:31,952 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:46:34,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:36,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:38,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 08:46:38,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:46:38,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:46:39,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:46:41,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 08:46:43,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:43,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:46:43,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:46:43,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 08:46:43,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:45,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:46:46,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:46,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:47,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 08:46:48,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:51,311 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.017e+02 2.325e+02 2.759e+02 4.576e+02, threshold=4.650e+02, percent-clipped=1.0 2023-09-29 08:46:51,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=311240.0, ans=0.2 2023-09-29 08:46:54,466 INFO [train.py:1039] (1/4) Epoch 9, batch 4200, loss[loss=0.2042, simple_loss=0.2573, pruned_loss=0.07557, over 23429.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.281, pruned_loss=0.07197, over 4715871.47 frames. ], batch size: 285, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:46:54,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:46:56,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 08:46:57,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:46:59,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:47:00,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:47:02,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:02,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:05,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 08:47:09,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 08:47:10,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:12,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:16,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:47:16,904 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.47 vs. limit=12.0 2023-09-29 08:47:21,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:47:21,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:21,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:22,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 08:47:22,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:24,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:24,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:47:24,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:47:26,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:47:26,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=311440.0, ans=0.1 2023-09-29 08:47:28,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 08:47:28,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:32,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:47:32,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:47:36,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:47:37,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:47:38,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=311440.0, ans=0.0 2023-09-29 08:47:40,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:47:40,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 08:47:41,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:47:42,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:47:46,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:47:50,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:53,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:47:55,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=311506.6666666667, ans=0.0 2023-09-29 08:47:58,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 08:48:00,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:04,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:48:06,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:07,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 08:48:15,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:48:15,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=311573.3333333333, ans=0.125 2023-09-29 08:48:18,141 INFO [train.py:1039] (1/4) Epoch 9, batch 4250, loss[loss=0.2311, simple_loss=0.3083, pruned_loss=0.0769, over 24071.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2795, pruned_loss=0.07168, over 4721239.02 frames. ], batch size: 80, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:48:19,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:48:19,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:48:22,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:28,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:48:29,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 08:48:29,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:48:33,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:36,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:48:41,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:41,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:43,435 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.95 vs. limit=15.0 2023-09-29 08:48:44,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:48:44,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:48:45,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:45,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:46,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=311706.6666666667, ans=0.125 2023-09-29 08:48:48,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:50,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:48:50,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=311773.3333333333, ans=0.125 2023-09-29 08:48:51,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:53,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 08:48:55,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=311773.3333333333, ans=0.2 2023-09-29 08:48:57,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 08:48:57,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:59,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:00,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:49:00,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:49:00,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:00,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:49:05,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:49:05,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:49:11,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:12,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:13,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 08:49:13,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:49:13,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=311840.0, ans=0.125 2023-09-29 08:49:14,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 08:49:16,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:49:19,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:49:20,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:20,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:49:22,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 08:49:24,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:49:24,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:49:25,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.98 vs. limit=15.0 2023-09-29 08:49:28,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:32,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:33,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:49:33,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:35,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:36,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=15.0 2023-09-29 08:49:36,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:49:37,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=311906.6666666667, ans=0.125 2023-09-29 08:49:38,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:49:38,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 08:49:39,728 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.120e+02 2.435e+02 2.958e+02 4.592e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-29 08:49:40,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:41,346 INFO [train.py:1039] (1/4) Epoch 9, batch 4300, loss[loss=0.2285, simple_loss=0.2834, pruned_loss=0.08679, over 23790.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2781, pruned_loss=0.07114, over 4718669.76 frames. ], batch size: 179, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:49:46,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:46,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:49:51,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:59,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:59,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 08:50:01,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:50:05,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:50:05,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:50:05,418 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 08:50:08,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:50:10,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:13,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 08:50:14,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:50:14,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 08:50:16,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:50:16,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=312106.6666666667, ans=0.125 2023-09-29 08:50:17,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:50:20,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:50:20,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:50:22,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:50:24,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:25,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:50:25,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 08:50:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 08:50:28,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:50:30,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:50:30,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:31,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 08:50:31,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 08:50:32,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 08:50:34,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:50:36,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 08:50:36,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 08:50:40,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:42,283 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 08:50:43,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:50:46,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:46,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:48,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 08:50:49,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:49,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:49,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:50:49,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:50:50,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=312240.0, ans=0.0 2023-09-29 08:50:51,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:50:51,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=312240.0, ans=0.125 2023-09-29 08:50:54,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:50:57,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:57,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:58,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:51:02,124 INFO [train.py:1039] (1/4) Epoch 9, batch 4350, loss[loss=0.1714, simple_loss=0.2427, pruned_loss=0.05004, over 20292.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2785, pruned_loss=0.07094, over 4715897.07 frames. ], batch size: 44, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:51:03,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 08:51:03,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:51:09,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:12,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:13,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=312306.6666666667, ans=0.125 2023-09-29 08:51:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:51:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:51:20,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:51:25,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:26,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:51:26,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:51:29,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:51:31,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:51:33,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:51:38,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 08:51:39,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:41,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:45,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=312440.0, ans=0.1 2023-09-29 08:51:46,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:48,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=312440.0, ans=0.125 2023-09-29 08:51:48,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=312440.0, ans=0.09899494936611666 2023-09-29 08:51:49,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 08:51:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:51:54,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:51:59,119 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 08:52:00,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:00,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:52:02,280 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 08:52:02,390 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 08:52:02,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:03,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:03,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:52:05,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:07,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:07,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:08,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 08:52:08,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:08,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:10,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:10,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 08:52:11,793 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 08:52:11,800 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 08:52:11,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 08:52:17,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:52:17,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:52:17,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:19,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:52:20,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 08:52:22,647 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.989e+02 2.158e+02 2.516e+02 4.089e+02, threshold=4.315e+02, percent-clipped=0.0 2023-09-29 08:52:22,914 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 08:52:22,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:24,244 INFO [train.py:1039] (1/4) Epoch 9, batch 4400, loss[loss=0.2103, simple_loss=0.2813, pruned_loss=0.0696, over 23692.00 frames. ], tot_loss[loss=0.211, simple_loss=0.2792, pruned_loss=0.07139, over 4710728.45 frames. ], batch size: 85, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:52:29,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:29,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:30,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:33,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 08:52:33,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 08:52:33,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 08:52:33,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 08:52:35,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:52:35,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:35,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=312640.0, ans=0.2 2023-09-29 08:52:38,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 08:52:41,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:42,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:43,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=312706.6666666667, ans=0.2 2023-09-29 08:52:44,178 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 08:52:47,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:47,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 08:52:48,018 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 08:52:51,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 08:52:53,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 08:52:53,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 08:52:53,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:53,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:53,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:54,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:57,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 08:52:57,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 08:52:57,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:02,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:53:02,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:02,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:03,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:03,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 08:53:05,431 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 08:53:08,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:15,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:53:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 08:53:21,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:53:25,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:27,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:53:29,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 08:53:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:53:30,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:53:30,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:53:30,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:53:33,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 08:53:38,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 08:53:39,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 08:53:39,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:39,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 08:53:39,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=312906.6666666667, ans=0.125 2023-09-29 08:53:40,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:53:43,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:53:45,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 08:53:46,777 INFO [train.py:1039] (1/4) Epoch 9, batch 4450, loss[loss=0.206, simple_loss=0.2696, pruned_loss=0.07121, over 16970.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2805, pruned_loss=0.07221, over 4709326.01 frames. ], batch size: 36, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:53:50,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:50,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=312973.3333333333, ans=0.125 2023-09-29 08:53:53,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:53,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:53:59,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:00,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:54:06,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:08,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:54:11,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:54:11,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:13,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 08:54:13,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:14,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:14,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:54:16,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:54:21,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:22,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:25,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:25,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:25,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:54:30,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:54:32,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 08:54:32,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 08:54:32,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:54:35,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:39,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 08:54:39,342 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:54:42,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:54:45,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:46,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 08:54:46,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:46,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:46,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:54:46,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:49,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:52,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:54:52,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 08:54:55,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:54:55,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:55,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=313240.0, ans=0.2 2023-09-29 08:54:56,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:58,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:58,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:55:00,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:55:00,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=313240.0, ans=0.0 2023-09-29 08:55:03,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 08:55:05,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:55:05,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=313240.0, ans=0.125 2023-09-29 08:55:06,703 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.016e+02 2.496e+02 2.860e+02 5.111e+02, threshold=4.992e+02, percent-clipped=1.0 2023-09-29 08:55:08,159 INFO [train.py:1039] (1/4) Epoch 9, batch 4500, loss[loss=0.2109, simple_loss=0.2613, pruned_loss=0.08021, over 23743.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2808, pruned_loss=0.0728, over 4698361.81 frames. ], batch size: 232, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:55:09,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:12,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 08:55:12,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 08:55:12,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=313306.6666666667, ans=0.0 2023-09-29 08:55:13,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:20,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:55:20,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:20,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:55:22,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:55:22,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:23,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:31,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=313373.3333333333, ans=0.125 2023-09-29 08:55:34,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:36,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:55:38,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:55:40,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:55:41,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:55:46,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=313440.0, ans=0.125 2023-09-29 08:55:49,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:55:50,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=313440.0, ans=0.1 2023-09-29 08:55:51,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=313440.0, ans=0.125 2023-09-29 08:55:53,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:55:59,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:56:00,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=313506.6666666667, ans=0.1 2023-09-29 08:56:02,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:56:02,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 08:56:02,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:04,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.31 vs. limit=15.0 2023-09-29 08:56:05,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:56:08,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:56:09,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 08:56:09,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:56:09,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:16,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:56:16,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:56:17,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:20,015 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.54 vs. limit=22.5 2023-09-29 08:56:20,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:56:20,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:56:22,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 08:56:25,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 08:56:25,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 08:56:30,942 INFO [train.py:1039] (1/4) Epoch 9, batch 4550, loss[loss=0.2092, simple_loss=0.2894, pruned_loss=0.06457, over 24559.00 frames. ], tot_loss[loss=0.2118, simple_loss=0.2794, pruned_loss=0.07212, over 4703980.54 frames. ], batch size: 71, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:56:31,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 08:56:34,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 08:56:35,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:37,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:38,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:45,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:56:47,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:49,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:56:49,895 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.04 vs. limit=15.0 2023-09-29 08:56:50,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:56:50,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:52,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:52,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:59,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:56:59,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=313706.6666666667, ans=0.125 2023-09-29 08:57:00,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 08:57:00,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 08:57:02,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:57:04,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 08:57:06,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 08:57:06,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 08:57:12,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:57:14,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:14,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=313773.3333333333, ans=0.125 2023-09-29 08:57:16,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:57:19,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 08:57:22,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:24,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:25,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:26,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:27,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 08:57:29,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 08:57:29,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:57:31,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 08:57:34,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 08:57:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:36,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:36,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:57:37,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:37,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:57:39,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:57:39,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 08:57:42,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:42,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 08:57:44,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 08:57:44,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:57:44,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 08:57:47,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:57:47,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:57:47,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=313906.6666666667, ans=0.1 2023-09-29 08:57:48,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.35 vs. limit=15.0 2023-09-29 08:57:51,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:57:52,531 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.108e+02 2.512e+02 3.014e+02 4.343e+02, threshold=5.024e+02, percent-clipped=0.0 2023-09-29 08:57:52,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:52,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:57:54,156 INFO [train.py:1039] (1/4) Epoch 9, batch 4600, loss[loss=0.2391, simple_loss=0.2959, pruned_loss=0.09113, over 23775.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2786, pruned_loss=0.0713, over 4708680.29 frames. ], batch size: 179, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:57:54,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:57:55,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:57:57,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:59,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:58:02,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:58:02,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:58:04,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:06,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 08:58:07,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:58:08,314 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:58:11,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:58:11,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:15,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:22,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 08:58:24,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:26,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:29,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:58:29,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:35,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 08:58:35,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:58:37,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:58:41,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:41,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=314106.6666666667, ans=0.0 2023-09-29 08:58:42,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:58:44,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:58:47,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 08:58:49,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:58:49,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=314173.3333333333, ans=0.2 2023-09-29 08:58:54,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:55,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:58:57,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:57,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:58:57,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:57,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 08:58:59,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:59,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:58:59,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=314240.0, ans=0.0 2023-09-29 08:59:01,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:02,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:59:04,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:05,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 08:59:05,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 08:59:05,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 08:59:07,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:07,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=314240.0, ans=0.1 2023-09-29 08:59:07,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.93 vs. limit=15.0 2023-09-29 08:59:09,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:11,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:11,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:17,975 INFO [train.py:1039] (1/4) Epoch 9, batch 4650, loss[loss=0.1682, simple_loss=0.2399, pruned_loss=0.04824, over 24354.00 frames. ], tot_loss[loss=0.2096, simple_loss=0.2777, pruned_loss=0.07071, over 4709716.80 frames. ], batch size: 56, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:59:21,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:59:21,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=314306.6666666667, ans=0.0 2023-09-29 08:59:25,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=314306.6666666667, ans=0.1 2023-09-29 08:59:26,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:26,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:26,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:59:26,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:27,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:29,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:31,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 08:59:36,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:59:37,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 08:59:37,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:39,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 08:59:39,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:59:39,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 08:59:40,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 08:59:40,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:40,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:59:45,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:59:46,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:46,734 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 08:59:49,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:51,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 08:59:54,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:54,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:59:55,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 08:59:57,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:00:02,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:00:05,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:10,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:13,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:00:17,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 09:00:19,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 09:00:19,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 09:00:19,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 09:00:20,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:21,870 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.82 vs. limit=15.0 2023-09-29 09:00:27,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:00:27,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:27,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 09:00:27,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:27,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=314573.3333333333, ans=0.125 2023-09-29 09:00:30,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:30,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:00:30,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:00:34,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:00:34,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:35,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:38,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:38,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:00:38,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:00:40,269 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 2.149e+02 2.483e+02 2.991e+02 4.624e+02, threshold=4.965e+02, percent-clipped=0.0 2023-09-29 09:00:40,329 INFO [train.py:1039] (1/4) Epoch 9, batch 4700, loss[loss=0.1766, simple_loss=0.246, pruned_loss=0.05355, over 24268.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2787, pruned_loss=0.07096, over 4712264.27 frames. ], batch size: 56, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:00:40,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:00:41,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:00:42,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 09:00:43,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=314640.0, ans=10.0 2023-09-29 09:00:50,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:50,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=314640.0, ans=0.0 2023-09-29 09:00:52,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:52,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:00:54,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:57,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:01:01,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 09:01:03,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 09:01:05,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:06,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:01:08,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:01:08,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=314706.6666666667, ans=0.0 2023-09-29 09:01:09,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=314706.6666666667, ans=0.0 2023-09-29 09:01:09,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=314706.6666666667, ans=0.125 2023-09-29 09:01:09,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=314706.6666666667, ans=0.0 2023-09-29 09:01:10,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.94 vs. limit=15.0 2023-09-29 09:01:11,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:16,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:01:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 09:01:21,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:01:26,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 09:01:27,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=314773.3333333333, ans=0.2 2023-09-29 09:01:28,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:01:30,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:35,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 09:01:35,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:01:41,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:01:41,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 09:01:42,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:42,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:43,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=314840.0, ans=0.1 2023-09-29 09:01:46,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:46,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:01:47,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 09:01:48,063 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 09:01:49,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:49,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=314906.6666666667, ans=0.0 2023-09-29 09:01:53,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 09:01:55,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:58,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 09:02:01,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:02:03,559 INFO [train.py:1039] (1/4) Epoch 9, batch 4750, loss[loss=0.2002, simple_loss=0.2754, pruned_loss=0.06253, over 24462.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2792, pruned_loss=0.07099, over 4722450.46 frames. ], batch size: 63, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:02:03,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:08,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:09,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:02:11,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 09:02:11,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:14,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 09:02:16,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:02:16,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:02:17,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:19,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=315040.0, ans=0.2 2023-09-29 09:02:22,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=315040.0, ans=0.125 2023-09-29 09:02:24,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 09:02:24,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=15.0 2023-09-29 09:02:28,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:02:30,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 09:02:31,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:36,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:38,240 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 09:02:38,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 09:02:43,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 09:02:43,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=315106.6666666667, ans=0.07 2023-09-29 09:02:46,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:48,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:02:50,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:02:50,229 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 09:02:50,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:02:54,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:02:58,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:03:00,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 09:03:01,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 09:03:01,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:01,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:03:03,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:03,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:03:05,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 09:03:07,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 09:03:07,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=315173.3333333333, ans=0.125 2023-09-29 09:03:10,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:11,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:03:11,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 09:03:11,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:12,457 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.58 vs. limit=22.5 2023-09-29 09:03:13,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:15,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:03:17,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:17,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:03:20,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:20,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 09:03:21,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 09:03:23,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 09:03:26,104 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.000e+02 2.225e+02 2.502e+02 3.899e+02, threshold=4.449e+02, percent-clipped=0.0 2023-09-29 09:03:26,155 INFO [train.py:1039] (1/4) Epoch 9, batch 4800, loss[loss=0.2022, simple_loss=0.286, pruned_loss=0.05921, over 24266.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2801, pruned_loss=0.07126, over 4720258.04 frames. ], batch size: 74, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:03:26,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:03:26,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:27,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 09:03:32,322 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.93 vs. limit=15.0 2023-09-29 09:03:35,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:36,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:40,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=315306.6666666667, ans=0.0 2023-09-29 09:03:41,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:03:43,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:43,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:44,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=315373.3333333333, ans=0.0 2023-09-29 09:03:45,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 09:03:45,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:45,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:03:47,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=315373.3333333333, ans=0.125 2023-09-29 09:03:48,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:03:53,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:56,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:56,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:03:56,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=315373.3333333333, ans=0.125 2023-09-29 09:03:57,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:57,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:03:57,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:59,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:03,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:05,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:04:08,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:04:10,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:10,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 09:04:12,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 09:04:12,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:12,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:04:13,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:04:13,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:13,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:04:15,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:04:17,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:19,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:24,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:25,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:29,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 09:04:29,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:29,782 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.79 vs. limit=15.0 2023-09-29 09:04:30,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:30,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:04:32,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:35,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:36,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:04:36,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:36,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:04:36,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:04:37,512 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.54 vs. limit=15.0 2023-09-29 09:04:38,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:04:42,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:42,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:42,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:44,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 09:04:47,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 09:04:47,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:47,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:49,615 INFO [train.py:1039] (1/4) Epoch 9, batch 4850, loss[loss=0.2245, simple_loss=0.3055, pruned_loss=0.07169, over 24417.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.28, pruned_loss=0.07095, over 4728407.79 frames. ], batch size: 69, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:04:49,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:04:49,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:52,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:54,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=315640.0, ans=0.125 2023-09-29 09:05:00,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 09:05:03,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:08,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:08,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:05:08,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:13,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:13,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:05:14,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:05:14,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 09:05:15,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=315706.6666666667, ans=0.125 2023-09-29 09:05:20,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:05:22,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:05:22,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:05:22,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:05:22,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 09:05:25,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:26,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:26,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=315773.3333333333, ans=0.0 2023-09-29 09:05:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 09:05:32,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 09:05:32,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:05:36,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=315773.3333333333, ans=0.125 2023-09-29 09:05:40,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:05:40,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=315840.0, ans=0.1 2023-09-29 09:05:41,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 09:05:42,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:05:42,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:05:42,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=315840.0, ans=0.0 2023-09-29 09:05:43,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:05:45,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 09:05:45,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:46,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 09:05:48,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:50,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:05:50,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 09:05:59,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:06:04,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:06:04,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:09,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 09:06:09,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:06:12,306 INFO [train.py:1039] (1/4) Epoch 9, batch 4900, loss[loss=0.2059, simple_loss=0.2557, pruned_loss=0.07805, over 23445.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2784, pruned_loss=0.07097, over 4720122.83 frames. ], batch size: 285, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:06:13,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.067e+02 2.446e+02 3.189e+02 7.103e+02, threshold=4.893e+02, percent-clipped=2.0 2023-09-29 09:06:14,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:15,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:15,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:06:19,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 09:06:24,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 09:06:28,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 09:06:31,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 09:06:31,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:33,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:33,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:06:33,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:33,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:06:34,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 09:06:36,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 09:06:37,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:06:39,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:06:41,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:44,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:06:44,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:46,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:06:46,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 09:06:47,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:06:48,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=316106.6666666667, ans=22.5 2023-09-29 09:06:49,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:50,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 09:06:50,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 09:06:54,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 09:06:54,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:06:56,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:06:57,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:06:57,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:57,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:06:57,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:06:59,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 09:07:04,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:06,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:07:07,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:07:10,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 09:07:12,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:07:13,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:07:14,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 09:07:15,491 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.76 vs. limit=15.0 2023-09-29 09:07:16,527 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=316173.3333333333, ans=0.125 2023-09-29 09:07:19,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:19,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=316240.0, ans=0.1 2023-09-29 09:07:22,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:07:23,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 09:07:23,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:23,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:07:25,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:30,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:30,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:07:30,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 09:07:32,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:07:35,216 INFO [train.py:1039] (1/4) Epoch 9, batch 4950, loss[loss=0.2012, simple_loss=0.2716, pruned_loss=0.06533, over 19352.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2768, pruned_loss=0.07072, over 4702637.60 frames. ], batch size: 42, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:07:37,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:37,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:39,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 09:07:41,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 09:07:41,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:07:42,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 09:07:42,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:42,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:07:44,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:07:44,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:07:47,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:49,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:07:49,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:07:50,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:50,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:52,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:55,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:07:57,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=316373.3333333333, ans=0.1 2023-09-29 09:08:00,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:01,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:08:03,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:05,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:07,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:08:07,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 09:08:08,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 09:08:10,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:13,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:08:13,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:08:16,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:08:16,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:08:19,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:08:20,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:25,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:08:27,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:08:27,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=316506.6666666667, ans=0.025 2023-09-29 09:08:28,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:28,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:30,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 09:08:30,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:08:31,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:08:34,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:08:35,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:08:35,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:08:36,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:38,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:08:40,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:08:41,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:08:41,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:08:43,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:43,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 09:08:49,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:08:49,655 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:08:53,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 09:08:53,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:08:53,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=316573.3333333333, ans=0.125 2023-09-29 09:08:56,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=316573.3333333333, ans=0.0 2023-09-29 09:08:59,013 INFO [train.py:1039] (1/4) Epoch 9, batch 5000, loss[loss=0.2155, simple_loss=0.2748, pruned_loss=0.07815, over 22797.00 frames. ], tot_loss[loss=0.2086, simple_loss=0.2767, pruned_loss=0.07028, over 4707564.72 frames. ], batch size: 322, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:09:00,622 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.030e+02 2.416e+02 2.948e+02 4.844e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 09:09:00,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:00,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:02,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 09:09:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 09:09:04,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.05 vs. limit=12.0 2023-09-29 09:09:05,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:09:06,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 09:09:07,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:09:07,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:09:08,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 09:09:08,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:08,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:08,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=316640.0, ans=0.125 2023-09-29 09:09:10,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 09:09:10,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:10,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:10,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=316640.0, ans=0.125 2023-09-29 09:09:13,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 09:09:13,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 09:09:15,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:09:15,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 09:09:15,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:09:16,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:16,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:09:16,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 09:09:16,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 09:09:20,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 09:09:20,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:20,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:22,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 09:09:22,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:23,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=316706.6666666667, ans=0.125 2023-09-29 09:09:24,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:25,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:28,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:09:28,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 09:09:30,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:09:31,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:09:36,650 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:09:37,782 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 09:09:39,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:41,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:41,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:09:47,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 09:09:47,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:47,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:49,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:50,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 09:09:50,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:54,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:56,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:00,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=316840.0, ans=0.0 2023-09-29 09:10:02,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 09:10:07,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:10,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=316906.6666666667, ans=0.125 2023-09-29 09:10:16,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:10:19,283 INFO [train.py:1039] (1/4) Epoch 9, batch 5050, loss[loss=0.1905, simple_loss=0.2684, pruned_loss=0.05635, over 24470.00 frames. ], tot_loss[loss=0.2093, simple_loss=0.2776, pruned_loss=0.0705, over 4716048.89 frames. ], batch size: 63, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:10:19,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:19,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:10:19,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:19,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:10:19,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:10:19,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:21,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=316973.3333333333, ans=0.5 2023-09-29 09:10:22,062 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.24 vs. limit=15.0 2023-09-29 09:10:24,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:24,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 09:10:27,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:10:30,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:31,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:10:32,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 09:10:33,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:34,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:10:38,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:10:39,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:10:41,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:10:46,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=317040.0, ans=0.125 2023-09-29 09:10:49,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 09:10:50,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:10:50,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:10:52,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 09:10:52,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:10:53,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:53,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:55,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:10:55,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 09:10:55,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 09:10:56,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:59,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:02,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:11:02,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 09:11:03,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:06,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 09:11:09,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:11:09,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:11:11,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:12,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:11:14,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:11:17,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:11:19,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:19,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:11:19,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:11:20,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 09:11:20,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:11:22,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=317173.3333333333, ans=0.1 2023-09-29 09:11:23,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:11:27,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:27,200 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 09:11:27,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:11:27,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=317240.0, ans=0.05 2023-09-29 09:11:27,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=317240.0, ans=0.2 2023-09-29 09:11:28,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:11:30,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:30,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=317240.0, ans=0.125 2023-09-29 09:11:31,676 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 09:11:35,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:35,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 09:11:35,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:35,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=317240.0, ans=0.125 2023-09-29 09:11:37,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=317240.0, ans=0.5 2023-09-29 09:11:40,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:41,480 INFO [train.py:1039] (1/4) Epoch 9, batch 5100, loss[loss=0.2058, simple_loss=0.2698, pruned_loss=0.07092, over 23393.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2785, pruned_loss=0.07101, over 4714440.33 frames. ], batch size: 134, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:11:41,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:41,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 09:11:43,046 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.032e+02 2.319e+02 2.659e+02 3.992e+02, threshold=4.638e+02, percent-clipped=0.0 2023-09-29 09:11:43,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 09:11:45,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:45,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:11:46,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:11:49,311 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 09:11:52,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:56,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 09:11:57,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 09:11:57,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:59,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:12:02,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:12:02,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 09:12:02,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 09:12:06,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:12:08,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:12:12,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:12:15,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 09:12:15,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:16,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:12:16,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 09:12:19,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 09:12:24,055 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 09:12:24,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:24,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 09:12:24,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 09:12:28,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:34,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=317506.6666666667, ans=0.0 2023-09-29 09:12:35,688 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:12:38,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:12:41,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 09:12:43,119 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 09:12:43,132 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 09:12:45,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 09:12:45,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:47,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 09:12:51,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 09:12:54,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:12:56,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:12:59,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 09:13:00,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:13:00,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 09:13:05,220 INFO [train.py:1039] (1/4) Epoch 9, batch 5150, loss[loss=0.1939, simple_loss=0.277, pruned_loss=0.05542, over 24467.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2799, pruned_loss=0.07219, over 4705928.30 frames. ], batch size: 69, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:13:06,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:13:07,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:13:07,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:13:08,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:13:08,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:13:10,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:13:10,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 09:13:10,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 09:13:12,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 09:13:13,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:13:13,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 09:13:13,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:15,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:13:17,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:19,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:23,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:13:23,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 09:13:25,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:26,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:13:28,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:13:28,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:28,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:28,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:13:28,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:13:31,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 09:13:32,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:13:32,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:13:34,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:13:37,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 09:13:37,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:13:38,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=12.0 2023-09-29 09:13:43,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:13:45,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 09:13:48,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:53,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=317773.3333333333, ans=0.125 2023-09-29 09:13:54,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:55,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:57,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=317840.0, ans=0.0 2023-09-29 09:14:01,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:01,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:04,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 09:14:06,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=317840.0, ans=0.125 2023-09-29 09:14:07,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:14:08,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:14:09,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:14:13,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:15,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:16,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 09:14:20,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:14:23,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:14:25,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:14:25,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:14:25,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:14:25,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:14:26,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:14:26,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:14:28,341 INFO [train.py:1039] (1/4) Epoch 9, batch 5200, loss[loss=0.2177, simple_loss=0.295, pruned_loss=0.07022, over 24666.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2811, pruned_loss=0.07273, over 4697912.69 frames. ], batch size: 68, lr: 1.12e-02, grad_scale: 16.0 2023-09-29 09:14:29,840 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.093e+02 2.397e+02 2.811e+02 4.237e+02, threshold=4.795e+02, percent-clipped=0.0 2023-09-29 09:14:30,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=317973.3333333333, ans=0.0 2023-09-29 09:14:31,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:14:32,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:14:36,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:40,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 09:14:40,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:14:41,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:44,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:45,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:14:45,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:45,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=318040.0, ans=0.125 2023-09-29 09:14:48,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 09:14:49,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:14:51,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:53,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 09:14:55,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:14:55,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:14:56,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 09:14:56,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 09:14:59,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 09:15:01,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:01,178 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 09:15:01,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:15:02,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:02,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:15:04,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 09:15:05,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:06,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=318106.6666666667, ans=0.125 2023-09-29 09:15:08,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:11,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 09:15:11,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 09:15:12,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 09:15:17,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 09:15:17,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:15:25,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:15:25,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:27,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 09:15:27,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:27,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:15:27,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:29,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:15:33,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:33,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:15:35,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=318240.0, ans=0.0 2023-09-29 09:15:38,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:40,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:40,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:47,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:47,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 09:15:47,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:48,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:15:49,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:49,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:15:49,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=318306.6666666667, ans=0.125 2023-09-29 09:15:50,527 INFO [train.py:1039] (1/4) Epoch 9, batch 5250, loss[loss=0.1797, simple_loss=0.2603, pruned_loss=0.04952, over 24464.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2802, pruned_loss=0.07236, over 4707267.46 frames. ], batch size: 63, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:15:50,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:15:52,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:55,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:55,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:15:57,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:16:04,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:16:04,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:16:07,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:16:08,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:16:10,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 09:16:11,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:16:12,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:16:19,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=318373.3333333333, ans=0.125 2023-09-29 09:16:24,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=318440.0, ans=0.1 2023-09-29 09:16:34,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.79 vs. limit=15.0 2023-09-29 09:16:41,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=318506.6666666667, ans=0.0 2023-09-29 09:16:44,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=318506.6666666667, ans=0.125 2023-09-29 09:17:02,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=318573.3333333333, ans=0.125 2023-09-29 09:17:04,988 INFO [train.py:1039] (1/4) Epoch 9, batch 5300, loss[loss=0.2059, simple_loss=0.2886, pruned_loss=0.06162, over 24494.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.278, pruned_loss=0.07158, over 4699909.62 frames. ], batch size: 66, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:17:07,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.994e+02 2.180e+02 2.520e+02 5.243e+02, threshold=4.360e+02, percent-clipped=1.0 2023-09-29 09:17:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:17:20,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 09:17:20,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 09:17:20,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:21,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:21,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:21,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:21,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:21,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:17:21,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:21,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:17:21,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:17:22,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 09:17:22,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 09:17:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 09:17:22,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:17:22,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 09:17:22,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 09:17:22,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:23,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:23,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:23,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:23,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:17:24,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:24,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:24,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:24,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:24,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:24,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:17:24,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:24,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:17:25,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 09:17:26,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:26,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:26,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 09:17:26,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 09:17:26,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:17:26,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:17:26,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 09:17:27,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 09:17:27,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:27,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:17:28,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:28,800 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 09:17:28,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 09:17:28,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:17:29,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:29,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 09:17:29,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 09:17:29,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 09:17:29,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:39,049 INFO [train.py:1039] (1/4) Epoch 10, batch 0, loss[loss=0.2169, simple_loss=0.2992, pruned_loss=0.06729, over 24557.00 frames. ], tot_loss[loss=0.2169, simple_loss=0.2992, pruned_loss=0.06729, over 24557.00 frames. ], batch size: 71, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:17:39,050 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 09:17:53,008 INFO [train.py:1071] (1/4) Epoch 10, validation: loss=0.3048, simple_loss=0.281, pruned_loss=0.1643, over 1125622.00 frames. 2023-09-29 09:17:53,009 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 09:17:56,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 09:17:56,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:17:58,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:18:02,225 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.38 vs. limit=15.0 2023-09-29 09:18:03,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:03,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:18:05,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:06,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 09:18:08,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 09:18:09,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:09,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:11,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=318786.6666666667, ans=0.1 2023-09-29 09:18:13,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:13,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:13,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:18:13,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:14,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 09:18:17,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:20,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=318786.6666666667, ans=0.125 2023-09-29 09:18:22,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=318786.6666666667, ans=0.1 2023-09-29 09:18:23,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:18:23,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:25,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 09:18:30,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:18:30,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:18:33,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:34,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=318853.3333333333, ans=0.125 2023-09-29 09:18:35,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=318853.3333333333, ans=0.125 2023-09-29 09:18:36,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:18:42,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:47,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 09:18:51,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 09:18:53,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:18:53,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:18:53,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:18:53,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:56,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 09:19:00,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:00,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:01,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=318986.6666666667, ans=0.1 2023-09-29 09:19:06,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:07,777 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=12.0 2023-09-29 09:19:08,819 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 09:19:10,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:19:12,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:13,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:19:13,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 09:19:15,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:19:15,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:19:16,513 INFO [train.py:1039] (1/4) Epoch 10, batch 50, loss[loss=0.2032, simple_loss=0.2648, pruned_loss=0.07073, over 23699.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2799, pruned_loss=0.07025, over 1070081.63 frames. ], batch size: 232, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:19:18,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:18,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:22,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:25,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 09:19:25,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:32,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:19:34,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 09:19:35,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=319120.0, ans=0.125 2023-09-29 09:19:38,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 09:19:39,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:19:41,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:19:41,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:43,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:19:44,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:19:44,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:19:44,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:48,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=319186.6666666667, ans=0.125 2023-09-29 09:19:48,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=319186.6666666667, ans=0.0 2023-09-29 09:19:52,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:19:53,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:54,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:19:55,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 09:19:57,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:19:57,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:19:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 09:19:58,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:00,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 09:20:06,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:07,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:20:08,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:11,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:11,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:11,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=319253.3333333333, ans=0.09899494936611666 2023-09-29 09:20:12,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=319253.3333333333, ans=0.125 2023-09-29 09:20:14,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 09:20:14,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 09:20:16,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:16,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:17,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:20:18,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:18,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 09:20:18,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=319253.3333333333, ans=0.125 2023-09-29 09:20:19,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 09:20:20,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:20:22,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.176e+02 2.452e+02 2.821e+02 3.971e+02, threshold=4.904e+02, percent-clipped=0.0 2023-09-29 09:20:22,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:22,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:20:24,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 09:20:24,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 09:20:24,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:25,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:27,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:20:27,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:20:30,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:20:34,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:20:34,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=319320.0, ans=0.125 2023-09-29 09:20:36,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:38,788 INFO [train.py:1039] (1/4) Epoch 10, batch 100, loss[loss=0.1917, simple_loss=0.2665, pruned_loss=0.05849, over 16168.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2801, pruned_loss=0.07181, over 1849048.12 frames. ], batch size: 35, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:20:38,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 09:20:38,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:45,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:20:45,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:45,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:45,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:45,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:47,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 09:20:49,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:20:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:49,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:49,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:55,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 09:20:56,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:57,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:57,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=319453.3333333333, ans=0.05 2023-09-29 09:20:58,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:21:01,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:21:01,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=319453.3333333333, ans=0.0 2023-09-29 09:21:04,740 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 09:21:04,776 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 09:21:06,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:06,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:21:08,251 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:21:09,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:21:12,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:21:14,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:14,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=319520.0, ans=0.2 2023-09-29 09:21:21,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:21,281 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 09:21:23,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:21:27,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:21:29,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:21:30,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:33,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:37,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:38,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:21:40,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=319586.6666666667, ans=0.0 2023-09-29 09:21:41,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:43,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:43,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:43,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:21:45,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:45,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 09:21:45,601 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 09:21:45,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=319653.3333333333, ans=0.2 2023-09-29 09:21:47,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:48,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:21:48,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:48,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:48,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 09:21:48,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:21:50,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:21:50,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:51,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:51,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:51,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:21:53,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:21:55,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:59,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:59,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:00,950 INFO [train.py:1039] (1/4) Epoch 10, batch 150, loss[loss=0.2037, simple_loss=0.2761, pruned_loss=0.06565, over 24328.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2803, pruned_loss=0.07067, over 2502068.37 frames. ], batch size: 61, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:22:01,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:02,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:04,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:05,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=319720.0, ans=0.125 2023-09-29 09:22:07,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:22:08,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:13,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 09:22:13,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 09:22:13,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 09:22:16,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:22:16,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:22:17,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:22:19,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:22:19,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:19,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:19,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:21,637 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 09:22:23,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:30,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:32,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:22:35,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 09:22:38,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:22:38,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:38,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:22:41,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:22:43,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=319853.3333333333, ans=0.125 2023-09-29 09:22:44,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:44,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:22:46,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:47,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 09:22:53,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:55,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:22:55,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:22:55,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:22:59,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:59,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=319920.0, ans=0.125 2023-09-29 09:23:00,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 09:23:02,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:23:04,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:23:05,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:08,107 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.959e+02 2.278e+02 2.639e+02 3.877e+02, threshold=4.556e+02, percent-clipped=0.0 2023-09-29 09:23:08,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:23:08,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 09:23:08,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:23:08,383 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 09:23:17,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:22,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:23:23,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:23:25,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 09:23:26,916 INFO [train.py:1039] (1/4) Epoch 10, batch 200, loss[loss=0.1973, simple_loss=0.2599, pruned_loss=0.06742, over 23614.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.2802, pruned_loss=0.07108, over 2995071.77 frames. ], batch size: 149, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:23:27,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:27,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:27,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.08 vs. limit=15.0 2023-09-29 09:23:31,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 09:23:31,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=320053.3333333333, ans=0.125 2023-09-29 09:23:31,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=320053.3333333333, ans=0.125 2023-09-29 09:23:32,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:23:33,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=320053.3333333333, ans=0.2 2023-09-29 09:23:34,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:35,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:39,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:23:41,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:41,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:56,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=320120.0, ans=0.125 2023-09-29 09:24:00,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:24:02,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:24:03,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:24:05,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:24:06,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:24:06,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:24:08,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:08,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:24:08,493 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=320186.6666666667, ans=0.125 2023-09-29 09:24:09,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:09,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:10,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=320186.6666666667, ans=0.1 2023-09-29 09:24:12,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 09:24:12,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:24:12,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:17,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:24:25,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:27,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=320253.3333333333, ans=0.0 2023-09-29 09:24:30,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:32,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:24:38,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:40,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=320320.0, ans=0.1 2023-09-29 09:24:41,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 09:24:42,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:42,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:24:42,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:43,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:24:44,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 09:24:44,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:24:44,711 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 09:24:45,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=320320.0, ans=0.0 2023-09-29 09:24:46,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:47,877 INFO [train.py:1039] (1/4) Epoch 10, batch 250, loss[loss=0.2199, simple_loss=0.2883, pruned_loss=0.07577, over 24271.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2792, pruned_loss=0.07071, over 3378306.30 frames. ], batch size: 77, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:24:50,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:24:51,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:51,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:53,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:24:53,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:57,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:24:59,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=320386.6666666667, ans=10.0 2023-09-29 09:25:00,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:04,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=320453.3333333333, ans=0.125 2023-09-29 09:25:07,923 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.02 vs. limit=10.0 2023-09-29 09:25:10,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:10,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=320453.3333333333, ans=0.125 2023-09-29 09:25:13,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:25:14,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:25:21,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:25:21,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:25:22,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:25:22,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:24,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:25:24,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:25:26,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:26,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=320520.0, ans=0.0 2023-09-29 09:25:29,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:25:33,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 09:25:33,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:35,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:25:35,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:25:36,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:25:36,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:25:36,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=320586.6666666667, ans=10.0 2023-09-29 09:25:36,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=320586.6666666667, ans=0.125 2023-09-29 09:25:38,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:25:38,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:25:40,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:41,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:25:41,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:25:46,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:25:50,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:50,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=320586.6666666667, ans=0.025 2023-09-29 09:25:53,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:55,084 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.019e+02 2.235e+02 2.582e+02 3.547e+02, threshold=4.469e+02, percent-clipped=0.0 2023-09-29 09:25:59,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:02,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:26:04,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=320653.3333333333, ans=0.125 2023-09-29 09:26:07,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 09:26:07,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:08,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:26:08,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 09:26:10,200 INFO [train.py:1039] (1/4) Epoch 10, batch 300, loss[loss=0.2006, simple_loss=0.2792, pruned_loss=0.061, over 24669.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2766, pruned_loss=0.06988, over 3671077.27 frames. ], batch size: 65, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:26:10,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:26:11,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:26:11,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 09:26:15,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=320720.0, ans=0.025 2023-09-29 09:26:16,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:26:17,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=22.5 2023-09-29 09:26:18,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:26:21,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:26:21,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 09:26:23,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:24,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:26:24,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 09:26:24,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:27,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:26:32,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:26:34,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 09:26:38,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 09:26:38,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:41,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:43,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:43,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 09:26:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:26:45,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:26:46,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:26:48,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:52,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:26:52,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 09:26:53,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:26:54,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=320853.3333333333, ans=0.2 2023-09-29 09:26:56,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:57,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 09:26:57,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:01,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:27:04,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:27:04,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 09:27:08,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:08,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:27:11,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:11,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=320920.0, ans=0.125 2023-09-29 09:27:14,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:27:14,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 09:27:14,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:27:14,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=320986.6666666667, ans=0.0 2023-09-29 09:27:15,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:17,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 09:27:20,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:21,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:22,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:22,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:22,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:27,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:27,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:27:30,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:31,861 INFO [train.py:1039] (1/4) Epoch 10, batch 350, loss[loss=0.2193, simple_loss=0.3021, pruned_loss=0.06831, over 24329.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2751, pruned_loss=0.06927, over 3896006.69 frames. ], batch size: 74, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:27:35,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:40,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:40,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=321053.3333333333, ans=0.125 2023-09-29 09:27:40,997 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.64 vs. limit=15.0 2023-09-29 09:27:41,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:45,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 09:27:47,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:47,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 09:27:49,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:50,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 09:27:50,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:55,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 09:27:56,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=321120.0, ans=0.2 2023-09-29 09:27:57,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:27:59,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:28:00,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:28:00,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=321120.0, ans=0.1 2023-09-29 09:28:02,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:02,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:02,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:28:04,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.85 vs. limit=12.0 2023-09-29 09:28:05,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:05,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:05,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=321186.6666666667, ans=0.125 2023-09-29 09:28:12,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:28:12,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:28:15,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:28:15,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:17,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=321186.6666666667, ans=0.0 2023-09-29 09:28:21,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 09:28:21,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:27,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:27,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:27,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:28:29,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 09:28:29,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=321253.3333333333, ans=0.1 2023-09-29 09:28:32,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:32,384 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 09:28:35,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 09:28:35,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:39,694 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 2.071e+02 2.540e+02 3.083e+02 5.946e+02, threshold=5.081e+02, percent-clipped=4.0 2023-09-29 09:28:39,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:39,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 09:28:43,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:44,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:28:46,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:47,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:47,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:50,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:52,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=321320.0, ans=0.125 2023-09-29 09:28:53,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:55,216 INFO [train.py:1039] (1/4) Epoch 10, batch 400, loss[loss=0.2294, simple_loss=0.2699, pruned_loss=0.09446, over 19227.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2749, pruned_loss=0.06968, over 4068655.28 frames. ], batch size: 388, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:28:56,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:28:58,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 09:28:58,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:58,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:02,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:29:02,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:08,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 09:29:10,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 09:29:10,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:11,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 09:29:11,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:15,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=321453.3333333333, ans=0.125 2023-09-29 09:29:16,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:29:16,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:16,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 09:29:18,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:29:18,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:19,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:29:21,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.35 vs. limit=22.5 2023-09-29 09:29:22,556 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 09:29:23,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 09:29:23,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=321453.3333333333, ans=0.2 2023-09-29 09:29:26,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=15.0 2023-09-29 09:29:29,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:30,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:30,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=321520.0, ans=0.125 2023-09-29 09:29:31,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 09:29:33,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 09:29:33,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=321520.0, ans=0.0 2023-09-29 09:29:34,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:29:37,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:29:42,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=321520.0, ans=0.0 2023-09-29 09:29:45,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 09:29:48,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:29:49,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 09:29:52,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:53,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.43 vs. limit=15.0 2023-09-29 09:29:54,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:29:55,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 09:29:58,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:30:02,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:30:04,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:30:04,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=321653.3333333333, ans=0.125 2023-09-29 09:30:07,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:08,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 09:30:10,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:30:11,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 09:30:15,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:30:15,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:30:16,767 INFO [train.py:1039] (1/4) Epoch 10, batch 450, loss[loss=0.2076, simple_loss=0.2839, pruned_loss=0.06568, over 24152.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2756, pruned_loss=0.06976, over 4209158.81 frames. ], batch size: 86, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:30:16,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 09:30:18,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:30:18,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:30:20,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:30:22,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 09:30:22,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:30:23,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:30:25,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:30:25,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 09:30:26,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:30:26,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:30:28,981 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-29 09:30:29,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:30:37,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:38,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:30:41,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 09:30:41,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 09:30:43,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-09-29 09:30:45,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:30:48,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:51,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:30:55,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.17 vs. limit=22.5 2023-09-29 09:30:56,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:56,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:59,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 09:30:59,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 09:31:00,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 09:31:02,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:02,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:03,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:31:05,340 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 09:31:05,354 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 09:31:05,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:31:06,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:31:07,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=321920.0, ans=0.125 2023-09-29 09:31:09,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:31:12,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:31:14,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:31:14,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:31:15,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 09:31:16,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=321920.0, ans=0.125 2023-09-29 09:31:17,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:17,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=321920.0, ans=0.125 2023-09-29 09:31:20,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:31:21,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:31:22,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 09:31:25,376 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.147e+02 2.621e+02 3.477e+02, threshold=4.294e+02, percent-clipped=0.0 2023-09-29 09:31:25,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:31:27,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 09:31:29,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 09:31:30,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:35,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:31:36,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:31:38,098 INFO [train.py:1039] (1/4) Epoch 10, batch 500, loss[loss=0.2968, simple_loss=0.3316, pruned_loss=0.131, over 19158.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2769, pruned_loss=0.07059, over 4307620.60 frames. ], batch size: 388, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:31:39,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:31:39,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 09:31:44,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:45,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:31:46,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:46,886 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 09:31:48,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 09:31:48,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:48,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=322053.3333333333, ans=0.125 2023-09-29 09:31:51,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:31:52,631 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.20 vs. limit=6.0 2023-09-29 09:31:54,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:31:57,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:32:00,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:32:00,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:32:02,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:08,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=322120.0, ans=0.1 2023-09-29 09:32:09,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=322186.6666666667, ans=0.125 2023-09-29 09:32:11,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:11,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:32:12,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:32:12,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:12,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 09:32:12,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:32:17,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:32:19,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:32:19,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:32:19,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:21,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 09:32:24,632 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 09:32:26,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:27,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:30,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:32:32,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 09:32:34,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:32:37,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:41,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:32:44,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:46,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=322320.0, ans=0.1 2023-09-29 09:32:48,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:52,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 09:32:52,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:52,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:55,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=322320.0, ans=0.0 2023-09-29 09:32:56,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 09:32:56,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:32:59,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:00,701 INFO [train.py:1039] (1/4) Epoch 10, batch 550, loss[loss=0.2091, simple_loss=0.2863, pruned_loss=0.06601, over 23822.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2778, pruned_loss=0.07048, over 4389671.38 frames. ], batch size: 85, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:33:04,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 09:33:07,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 09:33:07,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:07,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 09:33:09,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:33:09,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:10,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:12,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:12,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:33:14,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:33:14,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=322386.6666666667, ans=0.2 2023-09-29 09:33:15,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:17,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 09:33:17,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:33:20,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:20,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:23,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:25,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:25,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=322453.3333333333, ans=0.125 2023-09-29 09:33:25,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=322453.3333333333, ans=0.125 2023-09-29 09:33:29,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 09:33:31,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 09:33:32,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:33:37,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:33:37,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:39,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:33:39,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=322520.0, ans=0.1 2023-09-29 09:33:44,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:44,025 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 09:33:46,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:47,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:33:49,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:49,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:33:49,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:33:52,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:52,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 09:33:54,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 09:33:55,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:33:55,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:55,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=322586.6666666667, ans=0.125 2023-09-29 09:33:57,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:33:57,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:34:00,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:34:00,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:34:03,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:34:04,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:07,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:34:08,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:34:10,435 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.009e+02 2.272e+02 2.657e+02 5.113e+02, threshold=4.543e+02, percent-clipped=1.0 2023-09-29 09:34:10,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:12,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:34:12,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:14,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:34:14,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:34:21,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 09:34:22,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 09:34:24,160 INFO [train.py:1039] (1/4) Epoch 10, batch 600, loss[loss=0.2096, simple_loss=0.2865, pruned_loss=0.06637, over 24378.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2777, pruned_loss=0.07052, over 4455008.68 frames. ], batch size: 77, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:34:24,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:34:24,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:34:24,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:31,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:34:35,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:34:37,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 09:34:39,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:34:39,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=322786.6666666667, ans=0.0 2023-09-29 09:34:40,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:34:42,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:43,333 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-09-29 09:34:46,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 09:34:46,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:34:50,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=322786.6666666667, ans=0.125 2023-09-29 09:34:52,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 09:34:55,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:34:55,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:55,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:35:04,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:35:04,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:35:04,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:13,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:35:16,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:16,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:35:16,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:35:24,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 09:35:28,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:35:29,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:35:33,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 09:35:33,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:35:36,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 09:35:38,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:35:38,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:35:44,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:35:46,801 INFO [train.py:1039] (1/4) Epoch 10, batch 650, loss[loss=0.1886, simple_loss=0.2713, pruned_loss=0.05294, over 24437.00 frames. ], tot_loss[loss=0.2088, simple_loss=0.2767, pruned_loss=0.07044, over 4509806.48 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:35:46,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:35:49,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:35:51,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:35:55,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:35:56,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 09:35:56,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:36:00,657 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-09-29 09:36:03,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:36:03,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:03,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=323120.0, ans=0.0 2023-09-29 09:36:05,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=323120.0, ans=0.0 2023-09-29 09:36:06,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:11,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 09:36:14,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:14,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:19,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:19,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:36:23,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:23,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:23,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:36:24,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:25,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:36:25,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=323186.6666666667, ans=0.125 2023-09-29 09:36:28,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:36:28,094 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 09:36:28,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:28,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:33,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:33,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:35,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:36,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:36:36,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 09:36:38,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:36:38,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:36:38,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:36:38,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:39,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=323253.3333333333, ans=0.2 2023-09-29 09:36:40,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:36:41,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 09:36:44,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 09:36:44,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:44,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:44,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:36:45,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:46,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:50,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=323253.3333333333, ans=0.04949747468305833 2023-09-29 09:36:54,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:55,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:56,669 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.920e+02 2.159e+02 2.398e+02 3.616e+02, threshold=4.317e+02, percent-clipped=0.0 2023-09-29 09:36:56,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:59,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:59,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:37:01,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:37:09,365 INFO [train.py:1039] (1/4) Epoch 10, batch 700, loss[loss=0.2208, simple_loss=0.2778, pruned_loss=0.08191, over 23573.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2755, pruned_loss=0.06955, over 4565131.80 frames. ], batch size: 256, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:37:09,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:37:09,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:09,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:10,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:14,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 09:37:16,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 09:37:19,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 09:37:19,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:20,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:37:23,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 09:37:26,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=323453.3333333333, ans=0.125 2023-09-29 09:37:28,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-09-29 09:37:29,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:30,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=323453.3333333333, ans=0.125 2023-09-29 09:37:31,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:37:32,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:32,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:37:34,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:37:36,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:36,930 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.21 vs. limit=15.0 2023-09-29 09:37:39,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:37:39,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:37:42,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 09:37:45,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 09:37:49,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:37:49,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=323520.0, ans=0.0 2023-09-29 09:37:49,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=323520.0, ans=0.125 2023-09-29 09:37:50,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:37:52,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:37:57,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:37:57,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 09:38:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:04,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:38:05,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 09:38:06,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=323586.6666666667, ans=0.0 2023-09-29 09:38:10,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:38:10,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:13,079 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.58 vs. limit=5.0 2023-09-29 09:38:15,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:38:18,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:38:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 09:38:22,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 09:38:22,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 09:38:25,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:28,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:30,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:38:32,730 INFO [train.py:1039] (1/4) Epoch 10, batch 750, loss[loss=0.2196, simple_loss=0.2818, pruned_loss=0.07868, over 23749.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2753, pruned_loss=0.06925, over 4597795.90 frames. ], batch size: 232, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:38:32,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:32,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 09:38:33,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=323720.0, ans=0.125 2023-09-29 09:38:37,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 09:38:37,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 09:38:37,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=323720.0, ans=0.125 2023-09-29 09:38:38,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 09:38:38,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 09:38:38,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 09:38:40,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:38:40,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=323720.0, ans=0.125 2023-09-29 09:38:41,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 09:38:42,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:43,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:38:43,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:45,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:46,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:38:46,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:50,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:38:50,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:38:53,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:38:55,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:57,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:57,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 09:38:58,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:38:58,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:00,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:03,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:39:05,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 09:39:05,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:07,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 09:39:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 09:39:07,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 09:39:07,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:39:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:39:10,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:39:13,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=323853.3333333333, ans=22.5 2023-09-29 09:39:17,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:39:17,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:17,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:39:21,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:39:21,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:39:21,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=323920.0, ans=0.0 2023-09-29 09:39:22,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 09:39:22,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:39:24,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:39:25,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:39:29,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:39:29,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 09:39:30,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:36,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:39:38,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:39:38,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:39:41,886 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.983e+02 2.343e+02 2.893e+02 4.717e+02, threshold=4.686e+02, percent-clipped=1.0 2023-09-29 09:39:41,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:39:43,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 09:39:43,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:39:43,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:51,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:53,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:39:54,411 INFO [train.py:1039] (1/4) Epoch 10, batch 800, loss[loss=0.1766, simple_loss=0.247, pruned_loss=0.05306, over 24335.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2757, pruned_loss=0.06947, over 4617052.79 frames. ], batch size: 56, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:40:00,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:00,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:02,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:40:02,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:04,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:04,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:06,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:12,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:12,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:40:16,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 09:40:16,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:19,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:19,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:40:19,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:19,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 09:40:21,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:21,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 09:40:24,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:25,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:27,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:40:27,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:29,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=324186.6666666667, ans=0.125 2023-09-29 09:40:30,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:30,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:35,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:40:35,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:40:35,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:40:37,101 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 09:40:38,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 09:40:38,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:40:38,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:38,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:40,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:40:46,214 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 09:40:46,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 09:40:49,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:40:49,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:40:51,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:40:54,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:56,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 09:40:57,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:41:01,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 09:41:02,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=324320.0, ans=0.0 2023-09-29 09:41:07,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=324320.0, ans=0.0 2023-09-29 09:41:09,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:11,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:41:12,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 09:41:13,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:41:14,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:16,491 INFO [train.py:1039] (1/4) Epoch 10, batch 850, loss[loss=0.2294, simple_loss=0.3025, pruned_loss=0.07813, over 23768.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2759, pruned_loss=0.0693, over 4649249.80 frames. ], batch size: 85, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:41:16,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 09:41:16,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:20,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:41:21,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:22,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=324386.6666666667, ans=0.0 2023-09-29 09:41:23,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:41:25,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:41:26,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 09:41:28,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 09:41:28,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 09:41:28,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:28,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:41:30,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:31,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:31,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:41:37,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:37,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:41:37,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 09:41:42,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 09:41:45,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:47,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 09:41:53,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 09:41:55,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 09:41:57,574 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 09:41:57,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:41:57,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:41:57,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:42:00,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 09:42:05,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:42:05,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:06,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:42:06,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:42:08,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:42:09,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:42:11,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 09:42:13,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=324586.6666666667, ans=0.2 2023-09-29 09:42:14,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:42:14,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:16,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:42:16,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:17,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:19,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:19,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=324653.3333333333, ans=0.2 2023-09-29 09:42:22,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:42:22,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:42:24,595 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.918e+02 2.187e+02 2.530e+02 4.309e+02, threshold=4.375e+02, percent-clipped=0.0 2023-09-29 09:42:24,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:26,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:42:30,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=324653.3333333333, ans=0.2 2023-09-29 09:42:33,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:42:35,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:36,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 09:42:36,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:36,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:38,087 INFO [train.py:1039] (1/4) Epoch 10, batch 900, loss[loss=0.2475, simple_loss=0.3039, pruned_loss=0.09552, over 22756.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.277, pruned_loss=0.06984, over 4671632.58 frames. ], batch size: 322, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:42:38,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=324720.0, ans=0.125 2023-09-29 09:42:40,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.90 vs. limit=6.0 2023-09-29 09:42:41,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 09:42:45,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:42:47,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:47,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 09:42:50,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:42:51,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 09:42:52,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:42:53,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=324786.6666666667, ans=0.1 2023-09-29 09:42:54,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:54,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:42:54,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:42:54,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:43:08,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:09,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:43:09,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:43:14,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:18,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 09:43:19,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=324853.3333333333, ans=0.1 2023-09-29 09:43:20,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:43:24,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:43:25,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:43:26,968 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 09:43:27,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 09:43:35,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:43:35,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:43:36,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:43:44,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:44,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:43:45,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 09:43:47,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:48,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 09:43:48,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:43:50,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:50,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:43:51,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:43:55,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 09:43:56,495 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 09:43:58,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:43:59,850 INFO [train.py:1039] (1/4) Epoch 10, batch 950, loss[loss=0.1826, simple_loss=0.258, pruned_loss=0.05362, over 24340.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2765, pruned_loss=0.06915, over 4682916.10 frames. ], batch size: 61, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:43:59,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 09:44:03,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:07,004 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:44:08,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 09:44:12,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:14,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:44:18,334 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 09:44:21,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:21,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:21,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=325120.0, ans=0.125 2023-09-29 09:44:22,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:24,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:44:24,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 09:44:24,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:44:27,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:28,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 09:44:29,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:32,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:33,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 09:44:37,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:44:37,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:40,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:44:45,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:44:45,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:47,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=325253.3333333333, ans=0.125 2023-09-29 09:44:47,930 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.59 vs. limit=15.0 2023-09-29 09:44:49,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 09:44:50,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:44:50,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:44:52,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:44:53,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:53,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:44:57,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 09:44:58,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:45:00,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:01,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:01,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 09:45:03,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:03,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:45:03,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 09:45:04,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.12 vs. limit=15.0 2023-09-29 09:45:07,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:45:09,610 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.117e+02 2.443e+02 3.103e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-29 09:45:09,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:15,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:17,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 09:45:17,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 09:45:22,952 INFO [train.py:1039] (1/4) Epoch 10, batch 1000, loss[loss=0.2169, simple_loss=0.2999, pruned_loss=0.06691, over 24548.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2767, pruned_loss=0.06924, over 4693970.00 frames. ], batch size: 71, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:45:23,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:27,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 09:45:27,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:45:35,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:45:35,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 09:45:35,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 09:45:39,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:39,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:40,239 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:45:41,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=325453.3333333333, ans=0.125 2023-09-29 09:45:43,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:46,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 09:45:50,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 09:45:50,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 09:45:51,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:45:53,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 09:45:54,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 09:45:54,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 09:45:57,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:57,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:06,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:06,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:46:08,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:08,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:08,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 09:46:09,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:11,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:46:11,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:12,837 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 09:46:14,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 09:46:16,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 09:46:18,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 09:46:20,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=325586.6666666667, ans=0.2 2023-09-29 09:46:21,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:46:27,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=325653.3333333333, ans=0.2 2023-09-29 09:46:28,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:28,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:46:30,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:30,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:46:33,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 09:46:34,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:46:35,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 09:46:35,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 09:46:35,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=325653.3333333333, ans=0.0 2023-09-29 09:46:36,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:46:36,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:38,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:46:38,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=325653.3333333333, ans=0.1 2023-09-29 09:46:41,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:46:42,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:44,491 INFO [train.py:1039] (1/4) Epoch 10, batch 1050, loss[loss=0.2074, simple_loss=0.2598, pruned_loss=0.07747, over 22847.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2753, pruned_loss=0.06891, over 4688449.70 frames. ], batch size: 322, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:46:46,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:46:47,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:46:49,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:46:51,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:52,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:46:53,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=325720.0, ans=0.0 2023-09-29 09:46:54,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:46:56,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:46:59,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:46:59,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:46:59,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:47:01,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:47:03,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 09:47:03,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:04,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 09:47:06,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:47:06,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 09:47:06,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:47:15,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:47:15,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:47:15,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:19,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 09:47:20,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 09:47:20,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:47:22,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 09:47:25,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 09:47:26,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:47:27,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=325853.3333333333, ans=0.0 2023-09-29 09:47:31,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:47:34,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:47:34,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:47:35,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:47:40,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:47:42,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=325920.0, ans=0.1 2023-09-29 09:47:43,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 09:47:44,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 09:47:45,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 09:47:45,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:46,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:47:47,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 09:47:48,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=325986.6666666667, ans=0.0 2023-09-29 09:47:51,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:47:52,396 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.027e+02 2.343e+02 2.792e+02 3.800e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-29 09:47:54,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:54,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:47:56,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:47:56,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 09:48:04,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:48:04,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 09:48:04,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 09:48:05,653 INFO [train.py:1039] (1/4) Epoch 10, batch 1100, loss[loss=0.1976, simple_loss=0.2741, pruned_loss=0.06053, over 24647.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2745, pruned_loss=0.06835, over 4699716.73 frames. ], batch size: 65, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:48:05,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:48:11,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:14,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:48:17,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:48:18,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:48:19,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:19,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 09:48:20,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:48:22,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:48:26,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:48:29,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:48:29,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 09:48:30,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:48:32,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:32,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:48:35,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:48:37,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=326186.6666666667, ans=0.125 2023-09-29 09:48:38,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:48:44,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:48:47,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 09:48:47,726 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 09:48:47,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:52,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:53,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:48:53,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:55,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 09:48:56,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:48:56,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:48:56,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:48:56,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:58,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 09:49:04,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:49:04,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 09:49:08,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:49:11,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:49:13,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 09:49:13,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:49:15,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:18,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:18,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:20,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 09:49:21,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:49:21,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:23,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 09:49:23,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:49:23,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 09:49:25,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:49:25,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:49:25,490 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:49:26,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:49:28,272 INFO [train.py:1039] (1/4) Epoch 10, batch 1150, loss[loss=0.2251, simple_loss=0.2927, pruned_loss=0.07871, over 23838.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2759, pruned_loss=0.06885, over 4712965.31 frames. ], batch size: 212, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:49:31,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:34,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:49:36,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:36,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:49:38,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 09:49:38,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:49:41,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 09:49:43,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:43,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:49:49,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 09:49:51,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:55,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:56,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:49:58,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 09:49:58,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:49:58,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:50:00,207 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.43 vs. limit=15.0 2023-09-29 09:50:04,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 09:50:05,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:07,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:50:17,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:23,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:23,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 09:50:25,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:25,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:30,554 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 09:50:30,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:37,350 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.028e+02 2.366e+02 3.044e+02 5.235e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 09:50:39,067 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 09:50:43,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.25 vs. limit=15.0 2023-09-29 09:50:44,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:45,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:50:45,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:50:45,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:50:49,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:50:50,924 INFO [train.py:1039] (1/4) Epoch 10, batch 1200, loss[loss=0.2452, simple_loss=0.2977, pruned_loss=0.09636, over 23845.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2773, pruned_loss=0.06871, over 4721368.18 frames. ], batch size: 195, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:50:52,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=326720.0, ans=10.0 2023-09-29 09:50:54,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:50:54,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:50:57,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:57,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:51:00,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:51:02,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:51:04,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:04,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:06,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=326786.6666666667, ans=0.125 2023-09-29 09:51:07,438 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 09:51:10,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 09:51:15,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:51:17,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:51:20,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:20,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:51:20,396 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 09:51:23,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:31,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:51:31,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:51:31,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 09:51:33,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:51:35,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 09:51:40,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 09:51:40,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:42,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:43,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:44,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:51:44,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=326920.0, ans=0.1 2023-09-29 09:51:45,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:45,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:51:46,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:51:47,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 09:51:47,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:51:49,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:51:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:51:50,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:51,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:55,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:51:58,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:52:02,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 09:52:05,233 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 09:52:08,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:11,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:52:13,019 INFO [train.py:1039] (1/4) Epoch 10, batch 1250, loss[loss=0.197, simple_loss=0.2778, pruned_loss=0.05806, over 24496.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2779, pruned_loss=0.06921, over 4718304.73 frames. ], batch size: 66, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:52:13,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:52:15,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:52:16,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 09:52:18,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=327053.3333333333, ans=0.125 2023-09-29 09:52:21,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:52:22,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:23,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 09:52:26,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:52:26,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:52:31,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:52:33,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:33,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:52:33,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:36,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:52:37,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.26 vs. limit=22.5 2023-09-29 09:52:41,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:52:41,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:52:41,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:41,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=327120.0, ans=0.125 2023-09-29 09:52:42,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:44,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:47,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:52:49,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:52:52,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 09:52:54,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:52:57,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:52:57,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 09:52:57,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:59,578 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 09:52:59,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:59,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:04,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:06,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:07,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:53:08,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=327253.3333333333, ans=0.1 2023-09-29 09:53:09,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 09:53:09,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 09:53:09,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 09:53:09,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=327253.3333333333, ans=0.0 2023-09-29 09:53:10,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:12,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 09:53:12,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:16,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:53:16,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:53:17,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 09:53:18,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:53:19,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:53:19,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:53:20,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:53:21,761 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.035e+02 2.245e+02 2.590e+02 3.760e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 09:53:22,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 09:53:26,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:28,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:53:30,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:53:35,028 INFO [train.py:1039] (1/4) Epoch 10, batch 1300, loss[loss=0.1953, simple_loss=0.2699, pruned_loss=0.06036, over 24652.00 frames. ], tot_loss[loss=0.209, simple_loss=0.2784, pruned_loss=0.06978, over 4712789.91 frames. ], batch size: 65, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:53:35,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:53:37,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:38,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 09:53:42,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=327386.6666666667, ans=0.0 2023-09-29 09:53:43,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:45,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:53:48,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:53:49,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:49,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:53:51,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 09:53:53,723 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:53:55,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:53:56,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:53:57,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 09:53:59,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=327453.3333333333, ans=0.125 2023-09-29 09:54:01,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:54:04,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:04,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:06,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:54:08,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:08,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:54:10,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:54:11,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 09:54:13,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=327520.0, ans=0.125 2023-09-29 09:54:17,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:54:18,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:54:19,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 09:54:21,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:54:24,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:54:26,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:54:26,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 09:54:27,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:27,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 09:54:29,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:33,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:33,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:54:36,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 09:54:38,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 09:54:40,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 09:54:47,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:54:49,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 09:54:51,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:52,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=327653.3333333333, ans=0.1 2023-09-29 09:54:56,846 INFO [train.py:1039] (1/4) Epoch 10, batch 1350, loss[loss=0.2043, simple_loss=0.2604, pruned_loss=0.0741, over 23622.00 frames. ], tot_loss[loss=0.2087, simple_loss=0.278, pruned_loss=0.0697, over 4715392.71 frames. ], batch size: 232, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:54:59,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 09:55:03,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:03,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=327720.0, ans=0.125 2023-09-29 09:55:04,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.00 vs. limit=15.0 2023-09-29 09:55:05,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:07,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:55:07,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:10,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:55:10,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=327720.0, ans=0.125 2023-09-29 09:55:11,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:11,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=327786.6666666667, ans=0.1 2023-09-29 09:55:13,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=327786.6666666667, ans=0.1 2023-09-29 09:55:15,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:19,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 09:55:19,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:55:20,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:55:20,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=327786.6666666667, ans=0.125 2023-09-29 09:55:22,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 09:55:24,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:55:25,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:55:25,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 09:55:27,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 09:55:28,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 09:55:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:32,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 09:55:43,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:44,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.14 vs. limit=15.0 2023-09-29 09:55:49,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.84 vs. limit=15.0 2023-09-29 09:55:53,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:55,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:55:55,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 09:55:58,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:56:00,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 09:56:00,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:56:00,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:56:03,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:56:06,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 09:56:06,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:56:09,399 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.130e+02 2.395e+02 2.953e+02 6.223e+02, threshold=4.790e+02, percent-clipped=2.0 2023-09-29 09:56:12,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 09:56:14,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 09:56:20,584 INFO [train.py:1039] (1/4) Epoch 10, batch 1400, loss[loss=0.1949, simple_loss=0.2652, pruned_loss=0.06232, over 18913.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2761, pruned_loss=0.06886, over 4708720.19 frames. ], batch size: 41, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:56:20,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 09:56:23,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:56:25,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:56:27,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:56:31,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=328053.3333333333, ans=0.125 2023-09-29 09:56:34,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 09:56:35,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 09:56:36,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=328120.0, ans=0.0 2023-09-29 09:56:45,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:56:47,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:56:48,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:56:49,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.51 vs. limit=15.0 2023-09-29 09:56:50,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:56:55,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:56:56,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:57:02,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.01 vs. limit=22.5 2023-09-29 09:57:07,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:08,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:11,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 09:57:12,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:57:12,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:57:14,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:57:14,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:15,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:57:15,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:57:15,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:57:17,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 09:57:17,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:57:18,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=328253.3333333333, ans=0.125 2023-09-29 09:57:21,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:25,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:57:27,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=328320.0, ans=0.1 2023-09-29 09:57:31,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 09:57:33,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:57:35,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:57:38,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:57:38,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:41,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:57:43,358 INFO [train.py:1039] (1/4) Epoch 10, batch 1450, loss[loss=0.2258, simple_loss=0.3035, pruned_loss=0.07408, over 24046.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2755, pruned_loss=0.0687, over 4706747.06 frames. ], batch size: 80, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:57:45,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:57:47,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:57:47,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:47,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:57:52,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:53,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:57:53,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:55,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 09:57:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:57:57,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 09:57:57,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=328386.6666666667, ans=0.125 2023-09-29 09:57:58,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:00,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:00,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 09:58:01,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:01,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:58:01,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:58:01,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:02,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.95 vs. limit=15.0 2023-09-29 09:58:03,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:58:04,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.19 vs. limit=10.0 2023-09-29 09:58:06,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:10,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:11,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:58:11,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:58:14,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:58:14,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:19,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:58:19,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:22,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 09:58:24,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:29,236 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 09:58:30,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:32,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:58:33,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:35,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 09:58:41,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:41,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 09:58:43,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 09:58:45,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:49,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:58:49,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:53,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 09:58:53,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=328653.3333333333, ans=0.0 2023-09-29 09:58:54,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.987e+02 2.423e+02 2.995e+02 4.591e+02, threshold=4.846e+02, percent-clipped=0.0 2023-09-29 09:58:54,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 09:58:54,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 09:58:57,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:57,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:59:05,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=328720.0, ans=0.0 2023-09-29 09:59:06,765 INFO [train.py:1039] (1/4) Epoch 10, batch 1500, loss[loss=0.2004, simple_loss=0.2671, pruned_loss=0.06687, over 20308.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2755, pruned_loss=0.0693, over 4699476.11 frames. ], batch size: 44, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:59:10,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 09:59:11,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:59:11,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:59:11,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:12,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:14,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:59:14,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 09:59:16,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:59:16,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:59:16,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:17,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:59:19,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:59:21,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 09:59:28,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:59:29,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:59:29,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:31,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=328786.6666666667, ans=0.0 2023-09-29 09:59:33,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 09:59:39,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 09:59:39,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:41,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 09:59:42,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:59:44,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:59:47,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:47,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:59:48,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 09:59:49,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.75 vs. limit=12.0 2023-09-29 09:59:50,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:59:50,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:50,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 09:59:50,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:50,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=328853.3333333333, ans=0.125 2023-09-29 09:59:52,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=328853.3333333333, ans=0.125 2023-09-29 09:59:57,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:59:57,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 10:00:03,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:00:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:00:12,146 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 10:00:12,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.99 vs. limit=15.0 2023-09-29 10:00:14,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:14,076 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 10:00:15,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:17,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:17,265 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 10:00:18,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:00:19,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=328986.6666666667, ans=0.1 2023-09-29 10:00:21,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 10:00:22,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=328986.6666666667, ans=0.125 2023-09-29 10:00:23,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:27,773 INFO [train.py:1039] (1/4) Epoch 10, batch 1550, loss[loss=0.2145, simple_loss=0.2689, pruned_loss=0.08005, over 23714.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2763, pruned_loss=0.06924, over 4714532.58 frames. ], batch size: 149, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:00:27,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:27,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:27,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:28,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:29,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:00:31,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 10:00:31,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 10:00:31,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:00:32,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 10:00:32,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=329053.3333333333, ans=0.125 2023-09-29 10:00:32,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.12 vs. limit=15.0 2023-09-29 10:00:33,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 10:00:35,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:35,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=329053.3333333333, ans=0.125 2023-09-29 10:00:36,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:36,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:00:36,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:00:39,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:40,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:44,139 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 10:00:44,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:44,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:00:44,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:00:47,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:00:47,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 10:00:47,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=329120.0, ans=0.1 2023-09-29 10:00:49,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:49,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 10:00:51,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 10:00:51,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 10:00:51,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:51,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=329120.0, ans=0.125 2023-09-29 10:00:52,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:00:55,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:58,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 10:00:58,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 10:01:08,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:11,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=329186.6666666667, ans=0.125 2023-09-29 10:01:12,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:01:13,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:01:13,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:01:15,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 10:01:20,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:01:22,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:01:28,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:01:28,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:28,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 10:01:28,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=329253.3333333333, ans=0.0 2023-09-29 10:01:29,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:31,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:01:32,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:34,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:01:34,362 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 10:01:37,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:01:39,012 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.945e+02 2.181e+02 2.494e+02 3.678e+02, threshold=4.362e+02, percent-clipped=0.0 2023-09-29 10:01:44,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 10:01:49,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:50,485 INFO [train.py:1039] (1/4) Epoch 10, batch 1600, loss[loss=0.1942, simple_loss=0.2599, pruned_loss=0.0642, over 23708.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2767, pruned_loss=0.06923, over 4714730.37 frames. ], batch size: 149, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:01:50,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:52,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 10:01:54,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:56,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:56,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:01:56,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:01:57,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:02:01,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:01,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=329386.6666666667, ans=0.1 2023-09-29 10:02:02,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 10:02:03,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 10:02:06,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 10:02:09,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:09,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 10:02:10,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:02:11,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=329453.3333333333, ans=0.2 2023-09-29 10:02:12,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:02:19,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:02:22,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 10:02:25,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:02:25,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 10:02:25,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:27,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 10:02:34,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 10:02:42,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:42,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 10:02:43,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:44,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:44,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:02:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:02:50,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:02:50,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=329586.6666666667, ans=0.125 2023-09-29 10:02:53,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:02:53,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:53,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:55,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:02:57,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:02:57,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:02:58,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:03:04,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:06,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:03:07,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 10:03:07,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:03:09,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 10:03:13,619 INFO [train.py:1039] (1/4) Epoch 10, batch 1650, loss[loss=0.2053, simple_loss=0.2613, pruned_loss=0.0747, over 23435.00 frames. ], tot_loss[loss=0.2086, simple_loss=0.2778, pruned_loss=0.06972, over 4705882.21 frames. ], batch size: 285, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:03:13,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:15,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:03:15,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:03:15,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 10:03:15,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 10:03:17,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 10:03:17,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 10:03:20,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:21,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:21,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:03:21,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:03:24,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:25,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 10:03:30,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:03:30,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:30,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:03:30,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:03:31,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 10:03:32,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 10:03:39,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=329786.6666666667, ans=0.0 2023-09-29 10:03:39,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=329786.6666666667, ans=0.2 2023-09-29 10:03:40,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:03:42,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=329786.6666666667, ans=0.0 2023-09-29 10:03:43,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:03:49,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 10:03:51,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:03:54,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 10:03:55,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:00,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:04:00,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:04:01,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:02,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:04:03,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:04,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=329920.0, ans=0.125 2023-09-29 10:04:05,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=329920.0, ans=0.1 2023-09-29 10:04:07,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:08,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:08,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:09,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:09,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=329920.0, ans=0.05 2023-09-29 10:04:11,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:11,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:04:16,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:17,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 10:04:20,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:20,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 10:04:21,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 10:04:21,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 10:04:22,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:22,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:04:22,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:23,884 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.033e+02 2.438e+02 2.787e+02 4.126e+02, threshold=4.877e+02, percent-clipped=0.0 2023-09-29 10:04:24,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:24,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 10:04:27,229 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:04:28,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:30,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:04:30,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:33,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 10:04:35,234 INFO [train.py:1039] (1/4) Epoch 10, batch 1700, loss[loss=0.2176, simple_loss=0.2699, pruned_loss=0.08267, over 23814.00 frames. ], tot_loss[loss=0.2079, simple_loss=0.2769, pruned_loss=0.06944, over 4713450.61 frames. ], batch size: 212, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:04:37,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:37,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:04:37,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 10:04:38,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:04:38,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:04:38,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:42,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:04:42,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:04:43,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 10:04:46,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:04:51,111 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.13 vs. limit=22.5 2023-09-29 10:04:55,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:56,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:04:57,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.82 vs. limit=10.0 2023-09-29 10:05:01,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:05:01,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:03,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:05:03,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:08,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 10:05:09,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:05:09,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:11,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:05:12,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:05:16,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 10:05:16,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 10:05:17,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=330186.6666666667, ans=0.0 2023-09-29 10:05:18,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:19,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 10:05:22,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:05:30,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=330253.3333333333, ans=0.07 2023-09-29 10:05:31,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:32,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:33,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:35,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:05:35,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 10:05:35,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:37,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:37,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 10:05:37,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:05:37,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:39,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:39,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:05:41,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:41,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:05:41,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:43,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:05:43,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=330320.0, ans=0.0 2023-09-29 10:05:44,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:48,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:50,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 10:05:51,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:53,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:54,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 10:05:57,745 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.48 vs. limit=15.0 2023-09-29 10:05:58,421 INFO [train.py:1039] (1/4) Epoch 10, batch 1750, loss[loss=0.1969, simple_loss=0.278, pruned_loss=0.05787, over 24394.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2749, pruned_loss=0.06891, over 4700152.72 frames. ], batch size: 69, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:06:01,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:04,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:04,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:06:06,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 10:06:06,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:06:06,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=330386.6666666667, ans=0.1 2023-09-29 10:06:09,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:06:09,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:14,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 10:06:16,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:17,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 10:06:17,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:20,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:06:23,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:06:26,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 10:06:28,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:06:28,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 10:06:32,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=330520.0, ans=0.125 2023-09-29 10:06:35,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:06:38,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:06:38,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:41,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:41,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:44,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:06:47,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:49,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:50,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:53,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 10:06:54,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:56,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=330586.6666666667, ans=0.2 2023-09-29 10:06:57,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 10:06:59,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:01,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:01,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:07:05,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:07:05,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 10:07:07,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:10,063 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.086e+02 2.387e+02 2.992e+02 5.082e+02, threshold=4.774e+02, percent-clipped=1.0 2023-09-29 10:07:10,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:11,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=330653.3333333333, ans=0.125 2023-09-29 10:07:13,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:16,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:17,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:07:20,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 10:07:20,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:21,610 INFO [train.py:1039] (1/4) Epoch 10, batch 1800, loss[loss=0.2167, simple_loss=0.274, pruned_loss=0.07971, over 23774.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2744, pruned_loss=0.06818, over 4707117.05 frames. ], batch size: 164, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:07:21,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:07:21,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:21,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:07:21,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:07:21,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:07:24,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:07:26,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:26,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=330720.0, ans=0.125 2023-09-29 10:07:28,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:07:30,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=330720.0, ans=0.0 2023-09-29 10:07:31,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:35,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:07:36,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:07:40,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:07:43,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:43,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:44,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:07:47,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:48,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 10:07:49,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:52,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:56,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 10:07:59,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 10:08:00,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 10:08:01,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:01,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:08:01,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:02,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:08:10,142 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 10:08:11,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:08:11,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=330920.0, ans=0.2 2023-09-29 10:08:13,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:14,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 10:08:16,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 10:08:16,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:08:18,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:08:19,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:08:23,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 10:08:31,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:08:31,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 10:08:32,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:08:32,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:32,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:08:34,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 10:08:37,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:08:37,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:08:40,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 10:08:40,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:42,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:42,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:08:42,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:44,202 INFO [train.py:1039] (1/4) Epoch 10, batch 1850, loss[loss=0.2412, simple_loss=0.2914, pruned_loss=0.09549, over 23713.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2755, pruned_loss=0.06847, over 4717474.62 frames. ], batch size: 232, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:08:44,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:45,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:08:47,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:48,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:52,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:08:52,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:08:58,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:09:00,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 10:09:05,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 10:09:07,397 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-09-29 10:09:08,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 10:09:11,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:11,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 10:09:11,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 10:09:13,515 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.25 vs. limit=15.0 2023-09-29 10:09:21,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:09:22,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 10:09:25,759 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.79 vs. limit=22.5 2023-09-29 10:09:26,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:09:26,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:09:31,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 10:09:31,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:31,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:09:34,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:09:34,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:09:37,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:09:42,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:09:43,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:43,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:09:43,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:45,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:47,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:09:51,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 10:09:52,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:55,854 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.969e+02 2.199e+02 2.550e+02 3.875e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 10:09:57,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:09:59,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:09:59,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 10:09:59,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 10:10:01,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 10:10:02,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 10:10:04,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:10:04,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:10:04,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:05,652 INFO [train.py:1039] (1/4) Epoch 10, batch 1900, loss[loss=0.1911, simple_loss=0.2595, pruned_loss=0.06139, over 24367.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.276, pruned_loss=0.06853, over 4732725.39 frames. ], batch size: 56, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:10:05,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:05,835 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 10:10:05,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:10:07,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:07,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:10:07,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:10:09,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:10:10,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 10:10:12,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:12,126 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 10:10:12,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:10:12,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:17,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:20,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=331453.3333333333, ans=0.125 2023-09-29 10:10:20,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.78 vs. limit=12.0 2023-09-29 10:10:21,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:10:21,828 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 10:10:23,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 10:10:23,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:24,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:10:24,896 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 10:10:24,974 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 10:10:28,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 10:10:32,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:10:35,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 10:10:35,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 10:10:46,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 10:10:49,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 10:10:49,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:51,291 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 10:10:51,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 10:10:52,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 10:10:52,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 10:10:52,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:10:54,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=331586.6666666667, ans=0.125 2023-09-29 10:10:57,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 10:11:00,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:11:04,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:04,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 10:11:05,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=331586.6666666667, ans=0.125 2023-09-29 10:11:06,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:11:06,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=331586.6666666667, ans=0.0 2023-09-29 10:11:11,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 10:11:11,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:12,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=331653.3333333333, ans=0.0 2023-09-29 10:11:17,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:11:17,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:11:17,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:11:17,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:11:19,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:11:19,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:11:21,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:11:22,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:22,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:11:27,155 INFO [train.py:1039] (1/4) Epoch 10, batch 1950, loss[loss=0.2203, simple_loss=0.2848, pruned_loss=0.07792, over 23737.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2768, pruned_loss=0.06882, over 4740201.97 frames. ], batch size: 232, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:11:27,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:11:27,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:27,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:28,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:31,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:35,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:11:36,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:36,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:11:40,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 10:11:41,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:11:41,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:42,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=331786.6666666667, ans=0.1 2023-09-29 10:11:43,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:45,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:11:45,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=331786.6666666667, ans=0.2 2023-09-29 10:11:46,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:11:46,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:46,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=331786.6666666667, ans=0.1 2023-09-29 10:11:49,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:11:53,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:53,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:11:53,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:11:53,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:58,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:01,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:12:01,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:01,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:12:01,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 10:12:01,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:12:01,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:12:02,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:06,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:09,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:12:14,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:12:18,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:12:18,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:12:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 10:12:19,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:24,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:12:25,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:12:27,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:34,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:34,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=331986.6666666667, ans=0.2 2023-09-29 10:12:35,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:37,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:38,990 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.098e+02 2.334e+02 2.724e+02 3.808e+02, threshold=4.669e+02, percent-clipped=0.0 2023-09-29 10:12:40,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:42,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:12:44,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:45,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 10:12:45,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:12:45,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:47,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 10:12:47,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=332053.3333333333, ans=0.0 2023-09-29 10:12:48,041 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.81 vs. limit=22.5 2023-09-29 10:12:48,792 INFO [train.py:1039] (1/4) Epoch 10, batch 2000, loss[loss=0.1713, simple_loss=0.2439, pruned_loss=0.04938, over 21911.00 frames. ], tot_loss[loss=0.2086, simple_loss=0.278, pruned_loss=0.06955, over 4728638.78 frames. ], batch size: 48, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:12:48,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:12:53,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:54,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=332053.3333333333, ans=0.2 2023-09-29 10:12:55,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:12:55,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:58,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:13:00,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:03,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 10:13:05,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:13:06,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:13:08,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 10:13:10,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:13:10,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:13:12,470 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.50 vs. limit=22.5 2023-09-29 10:13:13,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:13:14,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 10:13:14,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:18,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 10:13:18,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:13:20,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 10:13:21,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:21,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=332186.6666666667, ans=0.125 2023-09-29 10:13:26,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:13:27,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:13:27,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:27,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:13:28,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:29,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 10:13:32,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 10:13:34,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:34,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:35,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=332186.6666666667, ans=0.125 2023-09-29 10:13:39,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:39,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:13:39,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:40,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:13:42,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:13:42,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:44,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:44,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:45,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:48,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:49,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=332253.3333333333, ans=0.2 2023-09-29 10:13:50,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 10:13:52,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-09-29 10:13:54,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:13:55,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:56,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=332320.0, ans=0.125 2023-09-29 10:13:58,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:14:04,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:07,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:07,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:09,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:14:09,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:14:10,532 INFO [train.py:1039] (1/4) Epoch 10, batch 2050, loss[loss=0.2092, simple_loss=0.2893, pruned_loss=0.06453, over 24464.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2768, pruned_loss=0.06915, over 4728027.62 frames. ], batch size: 69, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:14:12,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:15,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:15,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:21,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:14:24,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:14:25,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:25,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:14:28,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 10:14:28,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:14:31,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:14:31,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:14:40,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:40,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:42,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 10:14:45,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:45,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=332520.0, ans=0.0 2023-09-29 10:14:47,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 10:14:47,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:50,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:51,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:14:53,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:14:54,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:55,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:14:57,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:14:57,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:15:00,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:02,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:15:04,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:15:06,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:10,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:16,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:15:16,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 10:15:17,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.65 vs. limit=15.0 2023-09-29 10:15:21,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.30 vs. limit=15.0 2023-09-29 10:15:22,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:15:24,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:15:26,038 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.999e+02 2.333e+02 2.741e+02 4.462e+02, threshold=4.667e+02, percent-clipped=0.0 2023-09-29 10:15:27,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 10:15:30,105 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.47 vs. limit=15.0 2023-09-29 10:15:30,863 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 10:15:30,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:30,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:32,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:34,224 INFO [train.py:1039] (1/4) Epoch 10, batch 2100, loss[loss=0.1838, simple_loss=0.2642, pruned_loss=0.05176, over 24657.00 frames. ], tot_loss[loss=0.2066, simple_loss=0.2753, pruned_loss=0.06892, over 4710002.74 frames. ], batch size: 65, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:15:34,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:34,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 10:15:34,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 10:15:37,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:39,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:15:41,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:15:41,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:42,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:15:42,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 10:15:44,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:15:45,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 10:15:45,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 10:15:47,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=332720.0, ans=0.07 2023-09-29 10:15:49,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:15:49,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:15:49,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 10:15:49,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 10:15:55,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 10:15:55,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:58,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:59,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:16:03,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:16:03,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 10:16:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:05,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 10:16:06,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 10:16:06,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:08,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 10:16:09,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 10:16:09,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 10:16:12,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:16:13,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=332853.3333333333, ans=0.125 2023-09-29 10:16:14,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:16:17,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:17,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=332853.3333333333, ans=0.0 2023-09-29 10:16:20,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:22,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 10:16:24,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:24,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 10:16:27,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 10:16:28,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 10:16:29,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.59 vs. limit=15.0 2023-09-29 10:16:33,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:16:34,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:16:36,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 10:16:40,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=332986.6666666667, ans=0.0 2023-09-29 10:16:41,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:44,125 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.58 vs. limit=15.0 2023-09-29 10:16:45,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:16:45,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:16:45,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:16:45,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:16:46,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:16:47,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:47,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:16:48,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:16:48,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:50,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 10:16:51,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 10:16:51,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:16:54,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:54,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:16:54,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:16:56,175 INFO [train.py:1039] (1/4) Epoch 10, batch 2150, loss[loss=0.1997, simple_loss=0.2731, pruned_loss=0.06317, over 24444.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2747, pruned_loss=0.06787, over 4725784.22 frames. ], batch size: 63, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:16:56,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:17:03,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:17:04,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:06,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:07,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:17:07,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:07,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:17:09,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:11,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:17:11,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:17:14,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:14,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 10:17:21,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:22,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:17:24,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:24,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:24,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:25,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:17:25,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:25,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:17:27,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:17:27,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=333186.6666666667, ans=0.0 2023-09-29 10:17:28,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 10:17:30,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:17:31,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:31,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:34,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:17:35,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:17:37,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:38,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:17:40,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:40,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 10:17:40,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:17:43,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:44,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:47,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:47,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=333253.3333333333, ans=0.0 2023-09-29 10:17:48,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:17:50,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:50,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:50,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 10:17:52,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 10:17:52,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=333253.3333333333, ans=0.125 2023-09-29 10:17:53,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:17:53,804 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 10:17:53,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:53,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:17:55,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 10:17:55,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:17:55,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 10:17:56,972 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 10:17:56,972 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 10:17:57,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 10:17:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:59,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:59,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:18:01,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:01,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=333320.0, ans=0.125 2023-09-29 10:18:01,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=333320.0, ans=0.0 2023-09-29 10:18:04,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:18:04,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:04,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:10,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.840e+02 1.996e+02 2.223e+02 3.215e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-29 10:18:13,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:18:13,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 10:18:18,068 INFO [train.py:1039] (1/4) Epoch 10, batch 2200, loss[loss=0.2084, simple_loss=0.2856, pruned_loss=0.06559, over 23958.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2751, pruned_loss=0.06782, over 4728264.06 frames. ], batch size: 86, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:18:18,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:18:23,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:25,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:18:25,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:18:27,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:18:27,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=333386.6666666667, ans=0.125 2023-09-29 10:18:30,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:31,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:18:31,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 10:18:37,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 10:18:37,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=333453.3333333333, ans=0.0 2023-09-29 10:18:40,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:18:45,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 10:18:48,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:49,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:18:50,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:18:52,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:18:52,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 10:18:57,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:18:59,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:59,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 10:19:04,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:19:05,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:05,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=333586.6666666667, ans=0.125 2023-09-29 10:19:07,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:19:08,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:11,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 10:19:11,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:13,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 10:19:14,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:14,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:19:17,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:18,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:19:20,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:20,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:20,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:21,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:19:21,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:19:23,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:19:26,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:19:26,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:19:30,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:19:30,940 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 10:19:34,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:19:34,550 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 10:19:36,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:19:36,132 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 10:19:37,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:39,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:19:40,616 INFO [train.py:1039] (1/4) Epoch 10, batch 2250, loss[loss=0.1957, simple_loss=0.2609, pruned_loss=0.06526, over 17998.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.2755, pruned_loss=0.06772, over 4728413.30 frames. ], batch size: 39, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:19:40,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:41,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.90 vs. limit=10.0 2023-09-29 10:19:42,276 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 10:19:42,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=333720.0, ans=0.0 2023-09-29 10:19:43,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:19:45,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:19:52,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:19:53,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:19:54,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=333720.0, ans=0.1 2023-09-29 10:19:57,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=333786.6666666667, ans=0.125 2023-09-29 10:19:58,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:19:58,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:19:59,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:20:01,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 10:20:01,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:03,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:20:07,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 10:20:07,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:20:07,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:09,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:20:09,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=333786.6666666667, ans=0.125 2023-09-29 10:20:12,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:14,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:20:14,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:20:16,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 10:20:18,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:20,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:20:25,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:27,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:28,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:20:28,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:31,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:34,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:20:38,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:20:40,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:20:45,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:20:45,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:20:46,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:20:48,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=333986.6666666667, ans=0.2 2023-09-29 10:20:51,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:20:54,189 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.009e+02 2.238e+02 2.511e+02 3.719e+02, threshold=4.476e+02, percent-clipped=0.0 2023-09-29 10:20:54,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:20:54,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 10:20:54,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:20:56,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:20:59,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 10:21:02,438 INFO [train.py:1039] (1/4) Epoch 10, batch 2300, loss[loss=0.1859, simple_loss=0.2719, pruned_loss=0.0499, over 24397.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2762, pruned_loss=0.06846, over 4712709.73 frames. ], batch size: 77, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:21:02,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:21:02,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:07,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=334053.3333333333, ans=0.1 2023-09-29 10:21:08,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:08,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:21:12,961 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 10:21:15,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=29.90 vs. limit=15.0 2023-09-29 10:21:16,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:25,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:21:25,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:21:25,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:21:26,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:26,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 10:21:27,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=334120.0, ans=0.07 2023-09-29 10:21:28,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:21:31,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:31,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:21:34,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:21:35,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=334186.6666666667, ans=0.0 2023-09-29 10:21:36,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:21:39,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:21:43,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=334186.6666666667, ans=0.04949747468305833 2023-09-29 10:21:44,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=334186.6666666667, ans=0.0 2023-09-29 10:21:47,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:21:47,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:47,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=334186.6666666667, ans=0.125 2023-09-29 10:21:50,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:21:53,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:54,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:56,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:21:56,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:21:56,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 10:21:58,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=334253.3333333333, ans=0.125 2023-09-29 10:22:00,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:22:00,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:01,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:02,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:22:02,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:04,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:22:04,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:22:05,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 10:22:05,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:22:05,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:05,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 10:22:06,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=334320.0, ans=0.125 2023-09-29 10:22:12,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:22:16,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:22:22,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:22,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:22:22,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:22:24,222 INFO [train.py:1039] (1/4) Epoch 10, batch 2350, loss[loss=0.1763, simple_loss=0.2496, pruned_loss=0.05148, over 24634.00 frames. ], tot_loss[loss=0.209, simple_loss=0.2774, pruned_loss=0.07027, over 4694234.62 frames. ], batch size: 60, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:22:24,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:22:24,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:22:24,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:22:25,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 10:22:32,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:22:32,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 10:22:38,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 10:22:41,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:45,814 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.97 vs. limit=10.0 2023-09-29 10:22:46,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:22:47,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:22:48,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 10:22:52,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:22:57,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 10:23:00,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:23:01,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:23:01,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:23:04,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:23:06,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 10:23:06,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:23:08,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:23:08,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:08,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:23:13,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:23:14,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=334586.6666666667, ans=0.05 2023-09-29 10:23:15,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 10:23:15,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:23:17,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:23:17,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:23:20,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 10:23:22,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:23:22,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=334586.6666666667, ans=0.0 2023-09-29 10:23:25,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 10:23:25,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:23:30,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 10:23:31,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.85 vs. limit=10.0 2023-09-29 10:23:35,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 10:23:35,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=334653.3333333333, ans=0.0 2023-09-29 10:23:36,036 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.24 vs. limit=12.0 2023-09-29 10:23:36,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:36,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 10:23:36,768 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 10:23:36,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 10:23:38,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 10:23:39,745 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.129e+02 2.355e+02 2.700e+02 4.237e+02, threshold=4.711e+02, percent-clipped=0.0 2023-09-29 10:23:41,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:23:44,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:23:46,760 INFO [train.py:1039] (1/4) Epoch 10, batch 2400, loss[loss=0.204, simple_loss=0.2793, pruned_loss=0.0643, over 24301.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2768, pruned_loss=0.06986, over 4686284.41 frames. ], batch size: 77, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:23:49,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:23:50,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:23:50,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 10:23:52,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 10:23:59,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:23:59,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:24:01,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 10:24:02,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:24:04,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:04,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 10:24:08,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:11,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 10:24:12,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=334786.6666666667, ans=0.125 2023-09-29 10:24:12,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=334786.6666666667, ans=0.0 2023-09-29 10:24:17,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:24:23,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 10:24:26,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:24:28,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:32,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=334853.3333333333, ans=0.1 2023-09-29 10:24:33,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:33,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 10:24:33,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:24:39,631 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.61 vs. limit=22.5 2023-09-29 10:24:41,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:43,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:24:46,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:24:47,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:24:47,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:24:47,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:24:49,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:49,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:24:49,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:24:49,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=334920.0, ans=0.125 2023-09-29 10:24:52,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:24:52,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:24:54,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 10:24:56,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 10:24:56,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=334986.6666666667, ans=0.1 2023-09-29 10:24:57,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:57,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:57,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 10:24:57,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 10:24:57,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 10:24:57,921 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 10:25:01,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 10:25:01,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:25:04,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:04,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:06,435 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 10:25:06,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:07,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:25:08,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=335053.3333333333, ans=0.125 2023-09-29 10:25:09,569 INFO [train.py:1039] (1/4) Epoch 10, batch 2450, loss[loss=0.2062, simple_loss=0.2873, pruned_loss=0.06253, over 24585.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2759, pruned_loss=0.06963, over 4688636.79 frames. ], batch size: 71, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:25:11,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:25:11,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:15,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:15,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:17,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 10:25:22,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=335053.3333333333, ans=0.2 2023-09-29 10:25:23,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:25:23,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:27,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:25:27,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:25:27,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:25:27,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 10:25:32,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:33,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:25:34,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:25:39,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:25:39,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:42,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 10:25:43,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:25:51,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:53,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:53,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:25:53,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:25:53,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:54,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:25:56,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 10:26:01,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:26:01,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:26:05,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:05,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:10,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:26:10,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 10:26:11,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:26:11,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=335253.3333333333, ans=0.1 2023-09-29 10:26:12,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.31 vs. limit=15.0 2023-09-29 10:26:12,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:12,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 10:26:12,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:26:15,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:26:20,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:26:21,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:26:21,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:26:24,596 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.022e+02 2.367e+02 2.913e+02 5.353e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 10:26:26,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 10:26:26,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=335320.0, ans=0.0 2023-09-29 10:26:27,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:26:30,833 INFO [train.py:1039] (1/4) Epoch 10, batch 2500, loss[loss=0.1961, simple_loss=0.2745, pruned_loss=0.0588, over 24430.00 frames. ], tot_loss[loss=0.2059, simple_loss=0.2744, pruned_loss=0.06871, over 4691697.72 frames. ], batch size: 69, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:26:33,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:43,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:26:43,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:45,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:45,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 10:26:52,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:26:53,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:54,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:26:54,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:26:54,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 10:26:56,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:57,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:59,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 10:26:59,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:59,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 10:27:00,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:06,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:27:08,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:27:10,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=335520.0, ans=0.125 2023-09-29 10:27:11,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:27:11,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 10:27:13,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:15,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:19,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:25,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:29,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:33,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:27:36,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 10:27:38,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:27:38,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:27:41,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:27:41,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:27:41,496 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 10:27:41,497 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 10:27:42,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 10:27:44,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:46,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 10:27:46,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 10:27:48,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:48,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 10:27:48,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=335653.3333333333, ans=0.0 2023-09-29 10:27:49,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=335653.3333333333, ans=0.125 2023-09-29 10:27:52,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 10:27:53,795 INFO [train.py:1039] (1/4) Epoch 10, batch 2550, loss[loss=0.2042, simple_loss=0.2887, pruned_loss=0.05984, over 24317.00 frames. ], tot_loss[loss=0.2063, simple_loss=0.2752, pruned_loss=0.06875, over 4690753.84 frames. ], batch size: 74, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:27:56,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:27:56,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.64 vs. limit=15.0 2023-09-29 10:27:57,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:59,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:28:00,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:28:02,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 10:28:03,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:28:03,993 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:28:06,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 10:28:06,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=335720.0, ans=0.1 2023-09-29 10:28:08,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:28:09,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:12,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:28:12,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 10:28:12,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:14,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:14,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:14,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=335786.6666666667, ans=0.125 2023-09-29 10:28:16,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:28:16,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 10:28:17,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:28:17,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:17,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 10:28:33,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:28:38,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:38,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:40,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:40,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:28:46,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:48,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:48,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:28:48,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:28:49,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:28:49,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:28:53,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:53,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:58,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:28:58,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 10:28:58,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:29:00,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:00,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:29:01,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:29:03,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:04,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=335986.6666666667, ans=0.125 2023-09-29 10:29:09,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:29:10,868 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.915e+02 2.103e+02 2.425e+02 3.393e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 10:29:11,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:14,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 10:29:17,125 INFO [train.py:1039] (1/4) Epoch 10, batch 2600, loss[loss=0.2293, simple_loss=0.3051, pruned_loss=0.07669, over 24052.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2762, pruned_loss=0.06957, over 4699274.77 frames. ], batch size: 86, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:29:17,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 10:29:17,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:29:17,321 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 10:29:18,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 10:29:18,874 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 10:29:20,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:29:21,914 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 10:29:23,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 10:29:24,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=336053.3333333333, ans=0.07 2023-09-29 10:29:25,438 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 10:29:26,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:29:29,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 10:29:30,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 10:29:32,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:29:32,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 10:29:35,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 10:29:35,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 10:29:43,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:43,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:43,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:29:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 10:29:45,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=336120.0, ans=0.2 2023-09-29 10:29:46,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:29:50,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=336186.6666666667, ans=0.0 2023-09-29 10:29:51,449 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 10:29:51,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=336186.6666666667, ans=0.125 2023-09-29 10:29:56,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:56,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:58,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 10:29:58,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:58,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:30:00,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 10:30:03,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:30:03,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:30:05,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,779 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 10:30:09,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:30:17,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:30:19,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:30:19,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 10:30:19,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:30:22,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:30:23,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:29,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=336320.0, ans=0.2 2023-09-29 10:30:30,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 10:30:32,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:33,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:30:35,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=336320.0, ans=0.5 2023-09-29 10:30:38,149 INFO [train.py:1039] (1/4) Epoch 10, batch 2650, loss[loss=0.2017, simple_loss=0.2631, pruned_loss=0.07015, over 23895.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2761, pruned_loss=0.06938, over 4717135.82 frames. ], batch size: 195, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:30:40,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 10:30:40,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:40,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:30:40,680 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 10:30:40,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=336386.6666666667, ans=0.2 2023-09-29 10:30:42,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:30:45,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:47,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=336386.6666666667, ans=0.0 2023-09-29 10:30:48,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:30:50,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:50,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=336386.6666666667, ans=0.125 2023-09-29 10:30:52,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:55,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 10:30:55,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:30:55,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:30:58,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 10:30:58,771 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 10:30:59,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=336453.3333333333, ans=0.125 2023-09-29 10:31:01,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:05,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 10:31:05,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:06,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 10:31:11,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:31:11,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:16,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 10:31:16,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 10:31:22,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:31:24,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 10:31:24,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:26,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:26,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:26,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:26,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:28,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:30,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:30,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:31:31,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:31:33,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:31:33,682 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:31:34,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:31:39,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:39,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=336586.6666666667, ans=0.2 2023-09-29 10:31:41,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:41,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:31:45,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:45,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:31:45,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:46,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 10:31:49,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=336653.3333333333, ans=0.0 2023-09-29 10:31:50,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:53,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:54,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:56,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:31:57,509 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.958e+02 2.205e+02 2.606e+02 4.713e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-29 10:31:57,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:59,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:00,535 INFO [train.py:1039] (1/4) Epoch 10, batch 2700, loss[loss=0.1717, simple_loss=0.2471, pruned_loss=0.04818, over 24336.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2759, pruned_loss=0.06849, over 4732056.37 frames. ], batch size: 56, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:32:00,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=336720.0, ans=0.125 2023-09-29 10:32:01,441 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.15 vs. limit=15.0 2023-09-29 10:32:02,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:02,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 10:32:04,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:06,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:32:07,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:32:07,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=336720.0, ans=0.0 2023-09-29 10:32:09,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:09,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:10,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:32:10,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:32:12,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:32:12,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:32:12,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 10:32:12,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:32:14,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:32:16,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:32:16,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:32:20,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:32:20,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 10:32:22,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:32:24,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=336786.6666666667, ans=0.0 2023-09-29 10:32:25,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:32:27,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:32:34,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:32:34,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:34,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:32:34,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:32:37,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:32:42,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:32:42,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:32:42,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:32:49,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:49,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:32:51,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=336920.0, ans=0.125 2023-09-29 10:32:51,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=336920.0, ans=0.125 2023-09-29 10:32:56,346 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.69 vs. limit=12.0 2023-09-29 10:32:58,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:58,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:01,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:33:01,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:05,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:06,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:07,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:33:09,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:12,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:12,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:13,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:33:15,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:15,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:20,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 10:33:21,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:23,837 INFO [train.py:1039] (1/4) Epoch 10, batch 2750, loss[loss=0.2061, simple_loss=0.2867, pruned_loss=0.06277, over 24371.00 frames. ], tot_loss[loss=0.2059, simple_loss=0.2753, pruned_loss=0.0682, over 4734634.33 frames. ], batch size: 77, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:33:23,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:33:23,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 10:33:25,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 10:33:25,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:28,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:28,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:32,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:32,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:33:33,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:36,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:33:36,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:33:38,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:33:38,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:38,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 10:33:38,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:33:38,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:43,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 10:33:43,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=337120.0, ans=0.1 2023-09-29 10:33:46,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:48,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:48,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:48,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:33:49,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:51,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:33:51,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:53,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:55,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=337186.6666666667, ans=0.2 2023-09-29 10:33:57,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:33:57,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:33:57,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:33:58,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:58,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:33:59,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=337186.6666666667, ans=0.125 2023-09-29 10:33:59,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=337186.6666666667, ans=0.125 2023-09-29 10:34:02,684 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.53 vs. limit=22.5 2023-09-29 10:34:06,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:34:08,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:34:08,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:10,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=337186.6666666667, ans=0.125 2023-09-29 10:34:13,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:34:13,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:34:15,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:34:18,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=337253.3333333333, ans=0.04949747468305833 2023-09-29 10:34:21,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:34:21,863 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.33 vs. limit=10.0 2023-09-29 10:34:22,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:34:22,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 10:34:22,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=337253.3333333333, ans=0.015 2023-09-29 10:34:25,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.30 vs. limit=15.0 2023-09-29 10:34:26,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:30,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 10:34:34,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:34:37,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:34:37,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 10:34:38,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:34:41,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:34:41,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 10:34:41,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:34:42,599 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.063e+02 2.307e+02 2.543e+02 4.120e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 10:34:45,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.89 vs. limit=15.0 2023-09-29 10:34:46,181 INFO [train.py:1039] (1/4) Epoch 10, batch 2800, loss[loss=0.2176, simple_loss=0.268, pruned_loss=0.08357, over 23785.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2748, pruned_loss=0.06832, over 4729200.81 frames. ], batch size: 212, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:34:46,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:34:46,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:34:48,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:34:50,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 10:34:50,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:50,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:51,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:51,906 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 10:34:51,907 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 10:34:56,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:59,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:34:59,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:35:03,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:35:06,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 10:35:08,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:35:08,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 10:35:10,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:11,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:35:11,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:15,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=337453.3333333333, ans=0.125 2023-09-29 10:35:16,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:16,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:16,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:35:17,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:22,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=337520.0, ans=0.125 2023-09-29 10:35:26,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:35:28,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:35:29,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:31,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:35:32,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:39,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:39,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 10:35:40,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:40,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:40,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:35:46,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:47,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:51,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:53,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:35:53,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:53,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:35:54,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:35:54,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:35:56,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:56,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 10:35:56,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:35:57,083 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.67 vs. limit=22.5 2023-09-29 10:35:59,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:59,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:00,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 10:36:02,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:02,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:36:02,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=337653.3333333333, ans=0.2 2023-09-29 10:36:03,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:36:05,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 10:36:08,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=337720.0, ans=0.05 2023-09-29 10:36:10,400 INFO [train.py:1039] (1/4) Epoch 10, batch 2850, loss[loss=0.1769, simple_loss=0.2504, pruned_loss=0.05168, over 24293.00 frames. ], tot_loss[loss=0.2049, simple_loss=0.2742, pruned_loss=0.06782, over 4736466.66 frames. ], batch size: 56, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:36:12,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:36:12,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:36:12,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=337720.0, ans=0.125 2023-09-29 10:36:13,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:36:15,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:18,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:20,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:36:20,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:36:23,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:23,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:36:25,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:36:25,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=337786.6666666667, ans=0.0 2023-09-29 10:36:26,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 10:36:31,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 10:36:31,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:34,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 10:36:35,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:35,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=337786.6666666667, ans=0.1 2023-09-29 10:36:38,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 10:36:38,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 10:36:40,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:47,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=337853.3333333333, ans=0.125 2023-09-29 10:36:55,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:56,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:36:56,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:56,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:36:56,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:36:58,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:37:00,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:37:00,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 10:37:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:37:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:03,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:04,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:06,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:06,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:09,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:11,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:37:14,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:37:16,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:16,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:16,747 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:37:18,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:37:19,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=337986.6666666667, ans=0.125 2023-09-29 10:37:24,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:37:25,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 10:37:25,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 10:37:27,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:37:28,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:28,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 10:37:28,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:37:30,083 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.039e+02 2.270e+02 2.590e+02 3.840e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-29 10:37:30,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:30,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:32,247 INFO [train.py:1039] (1/4) Epoch 10, batch 2900, loss[loss=0.1902, simple_loss=0.2764, pruned_loss=0.05197, over 24443.00 frames. ], tot_loss[loss=0.2042, simple_loss=0.2739, pruned_loss=0.06724, over 4741507.88 frames. ], batch size: 69, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:37:32,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:37:32,318 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 10:37:32,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 10:37:32,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:32,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:37,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:37:37,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:37,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:38,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 10:37:43,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:43,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 10:37:45,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 10:37:45,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=338053.3333333333, ans=0.125 2023-09-29 10:37:47,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:37:47,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:37:49,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:51,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:55,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:55,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:55,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=338120.0, ans=0.0 2023-09-29 10:37:57,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:37:57,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=338120.0, ans=0.04949747468305833 2023-09-29 10:37:58,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 10:37:58,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:38:00,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:03,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 10:38:05,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 10:38:05,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=338186.6666666667, ans=0.0 2023-09-29 10:38:07,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:38:07,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 10:38:07,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:38:08,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:38:08,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:38:10,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=338186.6666666667, ans=0.0 2023-09-29 10:38:11,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:38:13,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:15,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:38:18,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:18,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=338186.6666666667, ans=10.0 2023-09-29 10:38:20,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 10:38:20,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 10:38:20,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:38:25,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:38:28,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 10:38:28,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:38:34,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:44,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:38:44,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:38:45,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 10:38:47,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=338320.0, ans=0.0 2023-09-29 10:38:49,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:49,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 10:38:49,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:51,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:38:55,097 INFO [train.py:1039] (1/4) Epoch 10, batch 2950, loss[loss=0.2027, simple_loss=0.2883, pruned_loss=0.05855, over 24357.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2752, pruned_loss=0.06771, over 4738268.09 frames. ], batch size: 74, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:38:55,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:58,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 10:38:58,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:38:58,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:01,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:02,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:39:04,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 10:39:04,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 10:39:04,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:39:04,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:39:11,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=338453.3333333333, ans=0.1 2023-09-29 10:39:13,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:14,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:16,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:39:16,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:19,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:39:19,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:39:19,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=338453.3333333333, ans=0.1 2023-09-29 10:39:21,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:23,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:24,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:39:25,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 10:39:26,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=338520.0, ans=0.0 2023-09-29 10:39:31,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 10:39:31,331 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 10:39:32,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:39:34,446 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 10:39:35,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 10:39:35,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:37,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:37,421 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 10:39:37,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:39:40,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 10:39:42,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:42,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:39:46,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:47,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:39:47,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:49,094 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 10:39:50,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:50,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 10:39:56,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:58,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:39:59,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 10:39:59,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:40:03,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 10:40:05,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=338653.3333333333, ans=0.1 2023-09-29 10:40:07,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:08,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:40:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:40:10,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:40:10,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:40:13,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:40:13,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:13,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:40:13,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:40:15,013 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.095e+02 2.360e+02 2.809e+02 3.974e+02, threshold=4.720e+02, percent-clipped=0.0 2023-09-29 10:40:15,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:16,524 INFO [train.py:1039] (1/4) Epoch 10, batch 3000, loss[loss=0.2018, simple_loss=0.2838, pruned_loss=0.05991, over 24491.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2763, pruned_loss=0.06831, over 4735808.91 frames. ], batch size: 66, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:40:16,525 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 10:40:31,303 INFO [train.py:1071] (1/4) Epoch 10, validation: loss=0.2858, simple_loss=0.2843, pruned_loss=0.1436, over 1125622.00 frames. 2023-09-29 10:40:31,304 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 10:40:31,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:40:33,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:33,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 10:40:34,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:38,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:40:38,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:40:43,342 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 10:40:43,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 10:40:45,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=338720.0, ans=0.1 2023-09-29 10:40:45,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=338720.0, ans=0.125 2023-09-29 10:40:46,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:40:46,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:40:47,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 10:40:49,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:40:55,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:41:06,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:41:14,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 10:41:16,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:41:16,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=338853.3333333333, ans=0.125 2023-09-29 10:41:19,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:41:19,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:41:19,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:41:21,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:21,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 10:41:22,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 10:41:24,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:41:24,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:41:25,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=338920.0, ans=0.125 2023-09-29 10:41:28,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:41:28,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:28,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:28,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:41:34,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=338920.0, ans=0.1 2023-09-29 10:41:35,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:41:35,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:35,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:41:36,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:39,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 10:41:40,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:41:40,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:41:40,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:41:45,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:45,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:47,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:41:47,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 10:41:47,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:41:49,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 10:41:49,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:41:52,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 10:41:54,032 INFO [train.py:1039] (1/4) Epoch 10, batch 3050, loss[loss=0.2264, simple_loss=0.295, pruned_loss=0.07891, over 23227.00 frames. ], tot_loss[loss=0.2079, simple_loss=0.2776, pruned_loss=0.06909, over 4723736.14 frames. ], batch size: 105, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:41:54,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:41:54,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:41:54,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 10:41:55,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 10:41:55,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:41:57,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:41:57,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=339053.3333333333, ans=0.1 2023-09-29 10:41:58,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:58,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:41:58,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:00,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:42:01,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 10:42:05,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:07,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:07,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:42:10,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:15,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 10:42:22,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 10:42:22,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 10:42:22,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:27,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:42:30,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:30,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:31,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:34,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:42:34,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:42:34,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:36,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:36,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:37,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:38,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:40,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:41,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 10:42:43,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:43,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:42:44,101 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.02 vs. limit=15.0 2023-09-29 10:42:45,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=339253.3333333333, ans=0.1 2023-09-29 10:42:46,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:46,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:42:48,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:42:48,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:42:51,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=339253.3333333333, ans=0.125 2023-09-29 10:42:53,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:56,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:02,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:02,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:02,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:03,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:05,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:43:05,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:43:07,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 10:43:09,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:09,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:10,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 10:43:12,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:15,089 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.990e+02 2.334e+02 2.687e+02 4.208e+02, threshold=4.668e+02, percent-clipped=0.0 2023-09-29 10:43:16,557 INFO [train.py:1039] (1/4) Epoch 10, batch 3100, loss[loss=0.2085, simple_loss=0.2865, pruned_loss=0.06528, over 24133.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2779, pruned_loss=0.06932, over 4721981.03 frames. ], batch size: 80, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:43:18,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:19,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:43:21,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:43:21,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=339386.6666666667, ans=0.125 2023-09-29 10:43:25,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 10:43:28,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 10:43:28,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 10:43:29,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:43:34,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:43:34,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:36,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:43:41,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:42,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=339453.3333333333, ans=0.125 2023-09-29 10:43:47,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 10:43:50,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:43:50,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:52,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:43:52,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:52,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=339520.0, ans=0.125 2023-09-29 10:43:53,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:43:55,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:43:55,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 10:43:55,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:43:56,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:58,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 10:44:00,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:04,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:44:05,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 10:44:07,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 10:44:08,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:08,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:44:10,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:10,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:10,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:44:12,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:44:12,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:44:14,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:44:15,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:44:15,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:15,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 10:44:18,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:44:19,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=339586.6666666667, ans=0.1 2023-09-29 10:44:19,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.60 vs. limit=12.0 2023-09-29 10:44:20,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 10:44:23,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:44:23,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 10:44:25,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:25,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:25,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 10:44:27,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=339653.3333333333, ans=0.1 2023-09-29 10:44:36,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=339653.3333333333, ans=0.0 2023-09-29 10:44:37,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 10:44:38,888 INFO [train.py:1039] (1/4) Epoch 10, batch 3150, loss[loss=0.2286, simple_loss=0.2965, pruned_loss=0.08032, over 23340.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.276, pruned_loss=0.06878, over 4698554.41 frames. ], batch size: 93, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:44:41,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:41,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:43,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:44:43,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:44:45,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 10:44:45,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:47,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:44:48,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 10:44:48,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:51,751 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 10:44:54,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 10:44:56,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:56,324 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 10:44:57,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:44:59,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 10:44:59,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 10:44:59,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 10:44:59,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:59,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:00,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:45:04,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 10:45:06,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:06,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:06,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:09,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:45:13,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 10:45:14,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:45:16,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:45:16,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:16,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=339853.3333333333, ans=0.0 2023-09-29 10:45:17,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 10:45:18,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=339853.3333333333, ans=0.0 2023-09-29 10:45:22,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 10:45:22,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:45:24,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:45:24,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:45:24,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:24,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:45:25,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:45:25,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:45:27,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 10:45:27,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:45:27,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:28,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:45:30,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:31,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 10:45:31,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:33,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 10:45:34,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:34,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 10:45:37,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 10:45:38,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:45:38,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=339920.0, ans=0.125 2023-09-29 10:45:40,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:40,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 10:45:42,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:45:43,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:44,566 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.69 vs. limit=10.0 2023-09-29 10:45:47,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:45:49,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:49,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:45:52,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=339986.6666666667, ans=0.125 2023-09-29 10:45:55,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:45:55,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:57,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:45:58,682 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.980e+02 2.239e+02 2.536e+02 3.813e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 10:46:00,296 INFO [train.py:1039] (1/4) Epoch 10, batch 3200, loss[loss=0.2074, simple_loss=0.2702, pruned_loss=0.07226, over 23533.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2741, pruned_loss=0.06835, over 4705660.29 frames. ], batch size: 256, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:46:05,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:46:05,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:46:08,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:10,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:46:10,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 10:46:11,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:46:17,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:46:21,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:31,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:46:42,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 10:46:42,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:46:44,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 10:46:45,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:46:49,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:46:49,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:46:51,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:46:56,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 10:46:58,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:46:59,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 10:47:01,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 10:47:02,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=340253.3333333333, ans=0.125 2023-09-29 10:47:04,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:47:05,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=340253.3333333333, ans=0.1 2023-09-29 10:47:09,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:10,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:47:10,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:12,469 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 10:47:12,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:47:15,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:18,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 10:47:18,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 10:47:19,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=340320.0, ans=0.5 2023-09-29 10:47:20,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 10:47:21,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 10:47:23,680 INFO [train.py:1039] (1/4) Epoch 10, batch 3250, loss[loss=0.2017, simple_loss=0.2812, pruned_loss=0.06111, over 24671.00 frames. ], tot_loss[loss=0.2043, simple_loss=0.2737, pruned_loss=0.06742, over 4717726.67 frames. ], batch size: 68, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:47:23,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:47:24,824 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.07 vs. limit=15.0 2023-09-29 10:47:29,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:47:29,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 10:47:29,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:47:29,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:32,553 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 10:47:37,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:47:40,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:47:47,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:47:47,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 10:47:48,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:48,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:48,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:47:50,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:47:50,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:47:53,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:53,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=340453.3333333333, ans=0.125 2023-09-29 10:47:54,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:47:56,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:47:56,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:47:59,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:00,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:48:03,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:03,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:48:04,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:05,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:48:05,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:11,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 10:48:11,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:48:11,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:48:12,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:14,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:48:18,492 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.95 vs. limit=10.0 2023-09-29 10:48:19,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:48:27,008 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:48:28,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:29,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:29,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 10:48:29,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:48:29,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:48:29,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:29,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=340653.3333333333, ans=0.0 2023-09-29 10:48:31,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 10:48:31,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 10:48:32,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=340653.3333333333, ans=0.5 2023-09-29 10:48:32,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=340653.3333333333, ans=0.1 2023-09-29 10:48:33,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:34,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:36,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:36,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:48:36,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=340653.3333333333, ans=0.0 2023-09-29 10:48:37,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:38,875 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.48 vs. limit=15.0 2023-09-29 10:48:39,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.96 vs. limit=22.5 2023-09-29 10:48:42,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:48:42,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:44,370 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 2.128e+02 2.401e+02 2.998e+02 4.766e+02, threshold=4.802e+02, percent-clipped=1.0 2023-09-29 10:48:44,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 10:48:44,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:48:46,474 INFO [train.py:1039] (1/4) Epoch 10, batch 3300, loss[loss=0.2019, simple_loss=0.2647, pruned_loss=0.06956, over 23647.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2748, pruned_loss=0.06777, over 4706637.26 frames. ], batch size: 149, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:48:46,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:48:46,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 10:48:49,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:51,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 10:48:52,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 10:48:52,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 10:48:53,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:56,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:57,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:48:57,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:59,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:49:00,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:49:03,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:05,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:10,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 10:49:12,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:12,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:14,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:15,726 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 10:49:17,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:49:17,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:49:19,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:49:19,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:49:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 10:49:22,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:22,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:49:24,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:24,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 10:49:25,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 10:49:25,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:26,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=340853.3333333333, ans=0.0 2023-09-29 10:49:27,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:49:30,464 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 10:49:33,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 10:49:33,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:49:35,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 10:49:38,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:49:39,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:49:40,547 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-09-29 10:49:41,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:49:45,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:45,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:45,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:49:48,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:49:48,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:49,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:49:50,665 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 10:49:52,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 10:49:55,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:49:57,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:57,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:49:58,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:58,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:00,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:50:00,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:00,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:50:00,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:03,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:50:05,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 10:50:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:06,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:08,279 INFO [train.py:1039] (1/4) Epoch 10, batch 3350, loss[loss=0.2166, simple_loss=0.2853, pruned_loss=0.07402, over 23449.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2747, pruned_loss=0.06791, over 4716144.53 frames. ], batch size: 93, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:50:08,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:50:08,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:50:09,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:11,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:11,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:15,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:50:17,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:19,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:50:23,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:50:26,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:27,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:50:28,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 10:50:30,650 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 10:50:30,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:35,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 10:50:35,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 10:50:35,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:50:35,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:50:38,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:50:38,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 10:50:38,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:38,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:50:41,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:42,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:44,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:45,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:50:47,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=341186.6666666667, ans=0.125 2023-09-29 10:50:47,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=341186.6666666667, ans=0.0 2023-09-29 10:50:49,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:52,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:53,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:56,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:56,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:00,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:00,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:01,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:03,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=341253.3333333333, ans=0.125 2023-09-29 10:51:05,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 10:51:05,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:51:05,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 10:51:05,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:51:06,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 10:51:07,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=341253.3333333333, ans=0.1 2023-09-29 10:51:08,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:09,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:14,056 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.51 vs. limit=15.0 2023-09-29 10:51:16,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:17,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 10:51:17,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:20,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:51:22,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:51:28,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:29,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=341320.0, ans=0.2 2023-09-29 10:51:30,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.041e+02 2.250e+02 2.635e+02 4.628e+02, threshold=4.499e+02, percent-clipped=0.0 2023-09-29 10:51:30,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 10:51:31,828 INFO [train.py:1039] (1/4) Epoch 10, batch 3400, loss[loss=0.2196, simple_loss=0.2775, pruned_loss=0.08084, over 23897.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2764, pruned_loss=0.06832, over 4728366.38 frames. ], batch size: 164, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:51:31,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:51:31,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:51:32,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=341386.6666666667, ans=0.125 2023-09-29 10:51:33,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:35,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 10:51:37,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:37,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 10:51:38,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:39,420 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-29 10:51:40,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:40,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:51:41,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:51:41,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 10:51:43,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 10:51:43,886 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 10:51:45,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:50,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:50,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:51,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:51:52,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:51:59,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:51:59,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 10:52:05,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:52:06,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=341520.0, ans=0.125 2023-09-29 10:52:08,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:09,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:11,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:52:12,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=341520.0, ans=0.0 2023-09-29 10:52:16,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:52:17,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.07 vs. limit=15.0 2023-09-29 10:52:19,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 10:52:24,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 10:52:25,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:52:27,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:27,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:52:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:52:32,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:32,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=341586.6666666667, ans=0.125 2023-09-29 10:52:35,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:52:35,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:52:38,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=341653.3333333333, ans=0.0 2023-09-29 10:52:41,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=341653.3333333333, ans=0.2 2023-09-29 10:52:42,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:52:44,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 10:52:51,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:52:53,943 INFO [train.py:1039] (1/4) Epoch 10, batch 3450, loss[loss=0.2062, simple_loss=0.2829, pruned_loss=0.06474, over 24678.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2778, pruned_loss=0.06933, over 4710083.62 frames. ], batch size: 65, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:52:54,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=341720.0, ans=0.125 2023-09-29 10:52:55,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 10:52:58,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=341720.0, ans=0.2 2023-09-29 10:52:59,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 10:53:00,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:00,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-09-29 10:53:01,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:53:01,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 10:53:03,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:53:06,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:53:10,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:53:12,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:13,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:53:13,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:16,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:24,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 10:53:28,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 10:53:28,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:53:28,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:53:30,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:32,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=341853.3333333333, ans=0.2 2023-09-29 10:53:32,214 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:53:36,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 10:53:37,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:53:39,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=341853.3333333333, ans=0.05 2023-09-29 10:53:39,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=341853.3333333333, ans=0.1 2023-09-29 10:53:42,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:53:42,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:43,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:53:45,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:53:45,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=341920.0, ans=0.0 2023-09-29 10:53:46,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 10:53:46,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:53:49,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=341920.0, ans=0.0 2023-09-29 10:53:50,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:52,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:53:55,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 10:53:55,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=341920.0, ans=0.0 2023-09-29 10:53:59,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:54:04,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:54:06,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:09,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:12,117 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.93 vs. limit=6.0 2023-09-29 10:54:12,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:12,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:54:12,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:54:12,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:54:16,391 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 2.010e+02 2.253e+02 2.520e+02 3.608e+02, threshold=4.507e+02, percent-clipped=0.0 2023-09-29 10:54:16,434 INFO [train.py:1039] (1/4) Epoch 10, batch 3500, loss[loss=0.1983, simple_loss=0.2366, pruned_loss=0.08001, over 19505.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.275, pruned_loss=0.06867, over 4694879.12 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:54:18,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:21,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:54:23,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 10:54:25,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:54:28,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 10:54:30,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=342053.3333333333, ans=0.2 2023-09-29 10:54:31,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:31,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 10:54:35,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:54:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:54:38,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:54:38,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:54:38,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:54:39,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:39,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:39,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 10:54:42,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:43,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=342120.0, ans=0.09899494936611666 2023-09-29 10:54:44,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:54:45,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=342120.0, ans=0.125 2023-09-29 10:54:46,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:51,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:51,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 10:54:51,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:54,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:55,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:54:58,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:59,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:54:59,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:01,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 10:55:01,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 10:55:02,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 10:55:04,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:04,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=342253.3333333333, ans=0.2 2023-09-29 10:55:05,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:05,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:07,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:55:07,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=342253.3333333333, ans=0.125 2023-09-29 10:55:10,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:55:10,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:55:15,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:16,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 10:55:16,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 10:55:16,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:55:18,213 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.58 vs. limit=15.0 2023-09-29 10:55:20,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:20,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:22,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:25,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 10:55:26,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:27,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:29,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 10:55:30,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 10:55:33,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:35,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:35,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:35,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:37,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.79 vs. limit=15.0 2023-09-29 10:55:38,068 INFO [train.py:1039] (1/4) Epoch 10, batch 3550, loss[loss=0.2088, simple_loss=0.2717, pruned_loss=0.07298, over 23139.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2734, pruned_loss=0.0683, over 4695252.48 frames. ], batch size: 105, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:55:39,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:55:44,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:46,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=342386.6666666667, ans=0.0 2023-09-29 10:55:48,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:55:48,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=342386.6666666667, ans=0.125 2023-09-29 10:55:50,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:52,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:55:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:55,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:55:55,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=342453.3333333333, ans=0.025 2023-09-29 10:55:55,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=342453.3333333333, ans=0.0 2023-09-29 10:55:56,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:56:00,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:00,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:56:00,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:00,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:56:02,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:56:08,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:56:08,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:10,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:10,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:11,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:56:11,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 10:56:11,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:13,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:13,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=342520.0, ans=0.2 2023-09-29 10:56:14,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:56:19,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:19,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:56:21,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:24,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 10:56:24,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:56:26,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 10:56:27,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:30,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:56:30,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:56:33,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 10:56:35,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:40,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:42,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 10:56:42,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:45,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=342653.3333333333, ans=0.125 2023-09-29 10:56:47,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:48,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 10:56:54,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=342653.3333333333, ans=0.0 2023-09-29 10:56:55,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 10:56:55,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:56:55,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:56:57,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:57:00,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=342720.0, ans=0.05 2023-09-29 10:57:01,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.40 vs. limit=12.0 2023-09-29 10:57:02,291 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.051e+02 2.303e+02 2.669e+02 4.347e+02, threshold=4.606e+02, percent-clipped=0.0 2023-09-29 10:57:02,334 INFO [train.py:1039] (1/4) Epoch 10, batch 3600, loss[loss=0.2055, simple_loss=0.2689, pruned_loss=0.07104, over 23696.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2735, pruned_loss=0.06839, over 4692861.53 frames. ], batch size: 232, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:57:04,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:05,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:07,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:57:07,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:57:09,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:09,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 10:57:14,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:57:15,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:20,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:23,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:24,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:57:24,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:26,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 10:57:26,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:28,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:29,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:57:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:31,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=342786.6666666667, ans=0.1 2023-09-29 10:57:33,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:33,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:57:36,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 10:57:43,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:57:45,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:57:45,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 10:57:49,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:57:49,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=342853.3333333333, ans=0.07 2023-09-29 10:57:53,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:56,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:02,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:58:02,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:58:02,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 10:58:05,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 10:58:05,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 10:58:05,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=342920.0, ans=0.2 2023-09-29 10:58:08,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:58:08,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:58:10,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 10:58:11,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:11,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:58:11,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:13,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 10:58:15,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 10:58:18,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:18,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 10:58:25,481 INFO [train.py:1039] (1/4) Epoch 10, batch 3650, loss[loss=0.1759, simple_loss=0.2595, pruned_loss=0.04617, over 24673.00 frames. ], tot_loss[loss=0.206, simple_loss=0.2751, pruned_loss=0.0685, over 4694616.56 frames. ], batch size: 65, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:58:25,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 10:58:27,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:58:27,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=343053.3333333333, ans=0.0 2023-09-29 10:58:29,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=343053.3333333333, ans=0.0 2023-09-29 10:58:30,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 10:58:33,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 10:58:35,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=343053.3333333333, ans=0.2 2023-09-29 10:58:36,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:58:36,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:58:36,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:58:41,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:58:41,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:42,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 10:58:44,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:58:45,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:45,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 10:58:46,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:58:48,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:58:48,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:58:49,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:58:52,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 10:58:54,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 10:58:54,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:58:56,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 10:58:59,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:58:59,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:59:02,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:59:04,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:04,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:59:06,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:59:06,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:59:07,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:59:10,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=343186.6666666667, ans=0.2 2023-09-29 10:59:12,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:14,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:14,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:59:16,053 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:59:17,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:59:19,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:19,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:24,436 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:59:25,662 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 10:59:29,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:29,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:30,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:59:31,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:31,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=343320.0, ans=0.125 2023-09-29 10:59:33,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:59:35,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:37,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 10:59:37,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:40,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:59:42,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=343320.0, ans=0.2 2023-09-29 10:59:43,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:43,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:59:47,288 INFO [train.py:1039] (1/4) Epoch 10, batch 3700, loss[loss=0.2117, simple_loss=0.2719, pruned_loss=0.07572, over 23521.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2761, pruned_loss=0.0691, over 4696654.60 frames. ], batch size: 134, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:59:47,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:47,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 10:59:47,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:48,846 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.980e+02 2.218e+02 2.465e+02 4.377e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 10:59:48,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:59:49,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:59:53,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:59:57,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:58,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:59:58,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:59:58,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:00:00,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:00:02,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:04,367 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 11:00:13,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:00:15,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:00:17,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:00:17,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 11:00:17,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:19,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=343520.0, ans=0.0 2023-09-29 11:00:20,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:20,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 11:00:22,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:24,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:00:25,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:25,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:00:26,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=343520.0, ans=0.0 2023-09-29 11:00:28,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:00:32,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:32,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 11:00:34,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:34,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 11:00:38,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=343586.6666666667, ans=0.125 2023-09-29 11:00:40,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:00:40,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:00:43,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=343586.6666666667, ans=0.0 2023-09-29 11:00:44,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:44,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 11:00:45,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:00:45,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:00:47,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:47,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:49,713 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.63 vs. limit=15.0 2023-09-29 11:00:50,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:52,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 11:00:52,353 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:00:53,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 11:00:55,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:00:55,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:00:57,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:00:58,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:01:00,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=343653.3333333333, ans=0.95 2023-09-29 11:01:01,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:01:01,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=343653.3333333333, ans=0.0 2023-09-29 11:01:03,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:01:04,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:07,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 11:01:08,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:01:10,106 INFO [train.py:1039] (1/4) Epoch 10, batch 3750, loss[loss=0.2287, simple_loss=0.287, pruned_loss=0.08522, over 23814.00 frames. ], tot_loss[loss=0.2085, simple_loss=0.2772, pruned_loss=0.0699, over 4701435.58 frames. ], batch size: 179, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:01:11,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:01:11,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 11:01:13,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:01:14,866 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.86 vs. limit=15.0 2023-09-29 11:01:15,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:01:18,180 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.71 vs. limit=15.0 2023-09-29 11:01:22,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:26,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:01:27,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=343786.6666666667, ans=0.125 2023-09-29 11:01:28,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:01:30,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:01:32,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:32,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343786.6666666667, ans=0.1 2023-09-29 11:01:33,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 11:01:33,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:35,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:35,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:35,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=343786.6666666667, ans=0.5 2023-09-29 11:01:38,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 11:01:43,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 11:01:44,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=343853.3333333333, ans=0.0 2023-09-29 11:01:45,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:45,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:47,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:51,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=343853.3333333333, ans=0.125 2023-09-29 11:01:52,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:55,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:01:58,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 11:02:01,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.34 vs. limit=15.0 2023-09-29 11:02:02,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:02,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=10.75 vs. limit=12.0 2023-09-29 11:02:08,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:02:08,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:02:09,183 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.60 vs. limit=10.0 2023-09-29 11:02:11,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:02:16,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:02:17,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:02:21,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:02:23,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:02:25,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:02:33,596 INFO [train.py:1039] (1/4) Epoch 10, batch 3800, loss[loss=0.2505, simple_loss=0.3019, pruned_loss=0.09959, over 23843.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.277, pruned_loss=0.06964, over 4710674.81 frames. ], batch size: 195, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:02:35,060 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.030e+02 2.447e+02 3.016e+02 6.033e+02, threshold=4.894e+02, percent-clipped=2.0 2023-09-29 11:02:35,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:02:38,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:38,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:02:40,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 11:02:40,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:43,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:02:44,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:02:45,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:02:45,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:47,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:02:48,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:48,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:02:50,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:02:50,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 11:02:53,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:02:53,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:02:58,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:00,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=344120.0, ans=0.125 2023-09-29 11:03:02,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:03:03,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:03:04,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=344120.0, ans=0.2 2023-09-29 11:03:05,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:03:05,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:08,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:10,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=344186.6666666667, ans=0.125 2023-09-29 11:03:11,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:16,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:03:16,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 11:03:18,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:21,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=344253.3333333333, ans=0.125 2023-09-29 11:03:23,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:28,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=344253.3333333333, ans=0.1 2023-09-29 11:03:29,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:03:33,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 11:03:34,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 11:03:36,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:37,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:39,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:39,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=344320.0, ans=0.1 2023-09-29 11:03:41,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 11:03:44,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 11:03:44,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 11:03:44,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:45,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:51,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:03:52,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:03:55,614 INFO [train.py:1039] (1/4) Epoch 10, batch 3850, loss[loss=0.1851, simple_loss=0.2511, pruned_loss=0.05959, over 24444.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2749, pruned_loss=0.069, over 4693208.96 frames. ], batch size: 58, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:03:58,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:04:00,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 11:04:02,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:04:03,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:07,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:04:10,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:12,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:04:12,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 11:04:17,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:19,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:21,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:21,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:04:25,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:25,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:04:25,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:25,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:04:25,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=344453.3333333333, ans=0.0 2023-09-29 11:04:28,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:31,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:04:32,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.98 vs. limit=15.0 2023-09-29 11:04:33,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 11:04:33,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 11:04:35,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:35,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:40,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 11:04:43,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 11:04:44,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:47,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 11:04:49,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:04:54,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:55,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:59,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:01,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 11:05:04,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 11:05:06,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=344653.3333333333, ans=0.125 2023-09-29 11:05:07,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:07,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:11,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:05:11,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:05:12,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:05:14,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 11:05:15,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:05:17,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 11:05:17,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:19,118 INFO [train.py:1039] (1/4) Epoch 10, batch 3900, loss[loss=0.2022, simple_loss=0.2861, pruned_loss=0.05918, over 24642.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2727, pruned_loss=0.0684, over 4692364.98 frames. ], batch size: 73, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:05:19,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:19,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:05:20,644 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.918e+02 2.148e+02 2.537e+02 4.144e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-29 11:05:20,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:22,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:05:22,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:22,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:23,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:23,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 11:05:23,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:29,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:29,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:30,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:05:31,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=344720.0, ans=0.125 2023-09-29 11:05:32,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:34,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:34,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:35,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:05:36,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 11:05:36,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:05:39,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 11:05:39,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:39,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 11:05:42,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 11:05:46,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:48,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:48,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:05:48,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:05:51,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=344786.6666666667, ans=0.1 2023-09-29 11:05:52,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:55,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:05:56,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:05:56,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:05:58,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:06:02,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=344853.3333333333, ans=0.125 2023-09-29 11:06:03,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:03,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:06:10,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=344920.0, ans=0.125 2023-09-29 11:06:12,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:06:14,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:06:24,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:06:28,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:30,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 11:06:30,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 11:06:30,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:33,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 11:06:34,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:06:34,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 11:06:42,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:42,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 11:06:43,935 INFO [train.py:1039] (1/4) Epoch 10, batch 3950, loss[loss=0.2296, simple_loss=0.2857, pruned_loss=0.08678, over 23817.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2732, pruned_loss=0.06788, over 4711745.23 frames. ], batch size: 195, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:06:44,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:06:47,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:06:48,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:06:53,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.43 vs. limit=10.0 2023-09-29 11:06:58,412 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 11:06:58,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:06:58,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 11:07:00,597 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 11:07:00,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:03,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:03,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:07:03,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:07,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 11:07:08,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:07:10,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:07:10,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:07:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:07:10,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:07:10,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=345120.0, ans=0.2 2023-09-29 11:07:18,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=345186.6666666667, ans=0.125 2023-09-29 11:07:20,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=15.0 2023-09-29 11:07:22,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:07:22,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:07:23,297 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-09-29 11:07:29,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 11:07:34,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 11:07:34,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 11:07:36,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:07:37,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:07:47,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:07:47,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:07:47,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:47,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:07:47,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 11:07:56,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:07:57,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:07:59,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 11:08:07,699 INFO [train.py:1039] (1/4) Epoch 10, batch 4000, loss[loss=0.2765, simple_loss=0.3154, pruned_loss=0.1188, over 19110.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2747, pruned_loss=0.06793, over 4722258.86 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 11:08:09,120 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.046e+02 2.407e+02 2.777e+02 6.014e+02, threshold=4.814e+02, percent-clipped=1.0 2023-09-29 11:08:10,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:16,413 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:08:17,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:22,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:23,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:08:23,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 11:08:25,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:08:26,477 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.63 vs. limit=22.5 2023-09-29 11:08:26,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 11:08:26,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:08:26,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 11:08:31,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:34,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:08:34,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:08:34,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:08:34,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:34,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:08:37,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:08:39,365 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 11:08:40,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:08:41,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:08:44,016 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 11:08:44,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:08:44,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:08:49,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=345520.0, ans=0.0 2023-09-29 11:08:52,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 11:08:53,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:55,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:08:56,892 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 11:08:58,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:08:58,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 11:08:58,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:08:59,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.13 vs. limit=12.0 2023-09-29 11:09:00,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:01,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:09:03,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:09:03,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:09:04,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=345586.6666666667, ans=0.125 2023-09-29 11:09:05,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:09:08,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 11:09:08,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:09,773 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.72 vs. limit=10.0 2023-09-29 11:09:10,466 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 11:09:16,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:09:18,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:09:20,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:09:21,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:23,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:09:24,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:29,483 INFO [train.py:1039] (1/4) Epoch 10, batch 4050, loss[loss=0.199, simple_loss=0.2787, pruned_loss=0.05968, over 24483.00 frames. ], tot_loss[loss=0.2049, simple_loss=0.2746, pruned_loss=0.06755, over 4726176.04 frames. ], batch size: 66, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:09:29,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:32,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:09:34,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 11:09:37,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:09:37,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:09:38,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:09:38,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:09:40,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:41,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=345720.0, ans=0.1 2023-09-29 11:09:44,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:49,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:09:49,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:09:50,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:09:50,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:55,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:57,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:09:59,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=345786.6666666667, ans=0.1 2023-09-29 11:10:00,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 11:10:02,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 11:10:02,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=345853.3333333333, ans=0.0 2023-09-29 11:10:03,770 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 11:10:05,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:10:10,420 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=345853.3333333333, ans=0.125 2023-09-29 11:10:10,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=345853.3333333333, ans=0.04949747468305833 2023-09-29 11:10:11,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 11:10:11,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:12,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.86 vs. limit=12.0 2023-09-29 11:10:16,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:20,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:10:21,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=345920.0, ans=0.0 2023-09-29 11:10:22,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:10:22,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:25,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:10:28,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 11:10:28,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:10:29,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:31,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 11:10:36,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:36,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=345986.6666666667, ans=0.125 2023-09-29 11:10:44,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 11:10:45,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:45,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:10:47,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 11:10:47,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 11:10:47,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:10:50,272 INFO [train.py:1039] (1/4) Epoch 10, batch 4100, loss[loss=0.2364, simple_loss=0.2917, pruned_loss=0.09055, over 22685.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2754, pruned_loss=0.06749, over 4731834.66 frames. ], batch size: 322, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:10:50,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:10:52,478 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.996e+02 2.168e+02 2.450e+02 3.987e+02, threshold=4.335e+02, percent-clipped=0.0 2023-09-29 11:10:52,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:10:52,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:10:57,708 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.41 vs. limit=15.0 2023-09-29 11:11:00,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=346053.3333333333, ans=0.1 2023-09-29 11:11:01,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 11:11:03,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=346053.3333333333, ans=0.125 2023-09-29 11:11:04,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 11:11:05,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 11:11:08,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 11:11:08,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:09,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:11:11,314 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 11:11:13,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=346120.0, ans=0.0 2023-09-29 11:11:14,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:14,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:11:15,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:17,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:11:20,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:11:21,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:23,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:11:23,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 11:11:23,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:23,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:11:23,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:23,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:11:24,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 11:11:27,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:28,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 11:11:30,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:11:32,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:32,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 11:11:35,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:11:35,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:11:37,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:11:38,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 11:11:40,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:11:40,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:11:44,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 11:11:44,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:11:48,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:53,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:11:56,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:11:58,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:12:06,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:06,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:12:09,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:12:12,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:12:13,809 INFO [train.py:1039] (1/4) Epoch 10, batch 4150, loss[loss=0.1889, simple_loss=0.2631, pruned_loss=0.05739, over 24677.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2755, pruned_loss=0.0678, over 4723308.79 frames. ], batch size: 65, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:12:17,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:12:19,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:12:19,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:12:19,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:23,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 11:12:23,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:23,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 11:12:23,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 11:12:25,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 11:12:26,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:31,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:12:31,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:36,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:12:38,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:12:38,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:12:38,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=346453.3333333333, ans=0.125 2023-09-29 11:12:40,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:12:40,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:42,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:12:43,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=346453.3333333333, ans=0.125 2023-09-29 11:12:46,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:50,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=346520.0, ans=0.125 2023-09-29 11:12:52,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:12:52,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 11:12:56,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 11:12:56,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:12:56,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 11:12:56,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:12:56,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:12:58,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=346520.0, ans=0.125 2023-09-29 11:13:00,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.47 vs. limit=22.5 2023-09-29 11:13:01,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:01,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:04,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 11:13:07,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:09,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:10,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 11:13:11,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:13:13,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 11:13:15,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:13:18,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:18,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:18,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=346653.3333333333, ans=0.125 2023-09-29 11:13:19,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 11:13:19,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:19,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:13:21,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:13:29,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 11:13:29,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:29,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:13:29,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:13:30,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 11:13:31,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:31,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:13:32,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:13:33,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:33,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 11:13:35,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:36,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=346653.3333333333, ans=0.125 2023-09-29 11:13:39,487 INFO [train.py:1039] (1/4) Epoch 10, batch 4200, loss[loss=0.194, simple_loss=0.235, pruned_loss=0.07649, over 19336.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2754, pruned_loss=0.0681, over 4708615.51 frames. ], batch size: 388, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:13:39,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:13:41,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 11:13:42,742 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.147e+02 2.478e+02 3.007e+02 3.865e+02, threshold=4.955e+02, percent-clipped=0.0 2023-09-29 11:13:43,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:13:46,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:13:48,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:13:49,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:49,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:51,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 11:13:53,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=346720.0, ans=0.0 2023-09-29 11:13:54,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 11:13:54,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:56,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:59,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:14:03,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:14:06,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:06,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:07,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 11:14:07,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:14:09,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:10,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:14:10,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:14:12,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:14:15,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 11:14:15,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:20,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:14:20,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:14:22,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:14:22,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=346853.3333333333, ans=0.125 2023-09-29 11:14:23,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:14:25,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:14:25,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 11:14:27,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:27,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:14:32,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:14:35,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:35,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=346920.0, ans=0.1 2023-09-29 11:14:42,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.96 vs. limit=15.0 2023-09-29 11:14:42,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:14:44,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 11:14:47,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:53,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:14:54,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:14:55,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 11:14:56,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=346986.6666666667, ans=0.0 2023-09-29 11:14:58,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=346986.6666666667, ans=0.125 2023-09-29 11:15:01,580 INFO [train.py:1039] (1/4) Epoch 10, batch 4250, loss[loss=0.2151, simple_loss=0.2882, pruned_loss=0.07096, over 24671.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2751, pruned_loss=0.06809, over 4714606.06 frames. ], batch size: 65, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:15:03,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:15:06,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:15:07,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:15:09,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:12,223 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.90 vs. limit=10.0 2023-09-29 11:15:13,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=347053.3333333333, ans=0.125 2023-09-29 11:15:14,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:15:14,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 11:15:14,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:15:17,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:22,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:26,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:28,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:29,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:15:29,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:15:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:33,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:35,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:38,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:15:39,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:41,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 11:15:43,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 11:15:43,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:45,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:45,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:46,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:15:46,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:48,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:51,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:15:51,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:15:56,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:15:58,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:00,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 11:16:00,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:16:01,253 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.32 vs. limit=12.0 2023-09-29 11:16:01,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 11:16:03,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:16:05,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:16:06,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:06,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:16:10,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 11:16:11,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:16:11,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:16:15,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:16,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=347320.0, ans=0.125 2023-09-29 11:16:18,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:21,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:16:22,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:16:24,810 INFO [train.py:1039] (1/4) Epoch 10, batch 4300, loss[loss=0.2001, simple_loss=0.2763, pruned_loss=0.06194, over 24492.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2742, pruned_loss=0.06732, over 4712258.66 frames. ], batch size: 63, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:16:24,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:26,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:16:27,853 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.974e+02 2.213e+02 2.611e+02 3.799e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 11:16:27,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:16:27,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 11:16:29,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:32,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:34,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:16:38,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:45,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:45,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 11:16:48,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:16:49,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:16:49,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:16:49,897 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 11:16:54,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:16:56,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:16:59,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 11:16:59,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:16:59,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 11:17:02,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:17:04,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:17:07,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:17:07,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:17:09,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:17:11,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:11,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:17:13,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 11:17:13,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 11:17:15,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:17:18,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:17:18,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:18,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 11:17:18,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 11:17:20,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 11:17:20,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:20,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 11:17:21,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 11:17:25,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:27,071 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 11:17:28,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:17:28,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:28,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:31,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 11:17:33,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:17:33,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:33,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:17:33,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:33,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:17:35,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:17:38,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:39,059 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=12.0 2023-09-29 11:17:40,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:40,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:44,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=347653.3333333333, ans=0.0 2023-09-29 11:17:47,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 11:17:48,398 INFO [train.py:1039] (1/4) Epoch 10, batch 4350, loss[loss=0.1831, simple_loss=0.268, pruned_loss=0.04913, over 24635.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2753, pruned_loss=0.06768, over 4718369.19 frames. ], batch size: 73, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:17:48,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:17:52,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=347720.0, ans=0.1 2023-09-29 11:17:53,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:54,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=347720.0, ans=0.0 2023-09-29 11:17:57,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:59,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:17:59,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:18:02,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=347720.0, ans=0.125 2023-09-29 11:18:05,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:18:08,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:18:11,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:18:11,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:16,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:18:19,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:18:21,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:18:27,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 11:18:28,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:28,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:30,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=347853.3333333333, ans=0.0 2023-09-29 11:18:33,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=347853.3333333333, ans=0.125 2023-09-29 11:18:35,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:36,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 11:18:39,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:41,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:18:42,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.93 vs. limit=15.0 2023-09-29 11:18:45,975 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 11:18:46,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:47,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:18:47,674 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 11:18:49,120 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 11:18:49,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:50,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:50,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:18:50,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:52,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:52,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:55,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 11:18:55,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:55,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:56,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:56,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 11:18:58,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 11:18:59,927 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 11:18:59,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 11:19:03,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:19:04,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:19:04,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:06,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:19:08,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 11:19:08,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=347986.6666666667, ans=0.2 2023-09-29 11:19:09,538 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 11:19:09,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:10,993 INFO [train.py:1039] (1/4) Epoch 10, batch 4400, loss[loss=0.2141, simple_loss=0.2727, pruned_loss=0.07779, over 23837.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2769, pruned_loss=0.06878, over 4717852.75 frames. ], batch size: 150, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:19:11,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=348053.3333333333, ans=0.125 2023-09-29 11:19:14,004 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.070e+02 2.289e+02 2.714e+02 5.548e+02, threshold=4.577e+02, percent-clipped=2.0 2023-09-29 11:19:14,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:14,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:17,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:19:18,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 11:19:18,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 11:19:20,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 11:19:20,241 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 11:19:21,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:19:21,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:24,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 11:19:26,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:28,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:28,850 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 11:19:30,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:30,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 11:19:30,664 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 11:19:31,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.63 vs. limit=15.0 2023-09-29 11:19:35,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 11:19:36,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 11:19:37,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 11:19:37,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:38,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:38,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:39,629 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.94 vs. limit=15.0 2023-09-29 11:19:40,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:19:42,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 11:19:42,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 11:19:43,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:19:46,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:48,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:48,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:48,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 11:19:48,516 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 11:19:50,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=348186.6666666667, ans=0.125 2023-09-29 11:19:51,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:58,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:20:01,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 11:20:04,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:20:06,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:10,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:20:10,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 11:20:10,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:20:10,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:20:10,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:20:11,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:20:15,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=348320.0, ans=0.0 2023-09-29 11:20:16,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 11:20:18,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 11:20:19,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 11:20:19,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:19,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 11:20:21,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:20:23,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=348320.0, ans=0.025 2023-09-29 11:20:24,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=348320.0, ans=0.0 2023-09-29 11:20:25,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:20:28,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 11:20:32,540 INFO [train.py:1039] (1/4) Epoch 10, batch 4450, loss[loss=0.2012, simple_loss=0.2825, pruned_loss=0.06, over 24326.00 frames. ], tot_loss[loss=0.2085, simple_loss=0.2783, pruned_loss=0.06936, over 4716409.56 frames. ], batch size: 61, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:20:32,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:35,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:36,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:20:38,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=348386.6666666667, ans=0.2 2023-09-29 11:20:44,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:20:44,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:20:49,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:52,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:20:55,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:20:55,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:56,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 11:20:57,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:20:59,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:20:59,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:21:02,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:21:06,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:07,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:09,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:21:09,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:21:10,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:21:15,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:21:17,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 11:21:17,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 11:21:17,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:21:22,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:23,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 11:21:26,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:21:28,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=348586.6666666667, ans=0.125 2023-09-29 11:21:32,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:33,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 11:21:33,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:33,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:33,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:21:33,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:35,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:37,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=348653.3333333333, ans=0.2 2023-09-29 11:21:38,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:21:39,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 11:21:41,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:21:43,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:21:43,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:44,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=348653.3333333333, ans=0.125 2023-09-29 11:21:47,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:47,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:21:48,442 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.36 vs. limit=12.0 2023-09-29 11:21:48,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:21:51,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 11:21:53,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:21:54,815 INFO [train.py:1039] (1/4) Epoch 10, batch 4500, loss[loss=0.2196, simple_loss=0.2712, pruned_loss=0.08397, over 23825.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2773, pruned_loss=0.06966, over 4711715.78 frames. ], batch size: 164, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:21:58,632 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.956e+02 2.459e+02 2.945e+02 4.663e+02, threshold=4.917e+02, percent-clipped=1.0 2023-09-29 11:21:58,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:00,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 11:22:00,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 11:22:01,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:07,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=348720.0, ans=0.125 2023-09-29 11:22:07,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-09-29 11:22:08,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:22:08,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:09,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:22:10,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:22:10,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:11,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:22,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:23,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:22:25,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:27,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:22:27,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=348853.3333333333, ans=0.125 2023-09-29 11:22:28,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:22:29,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=348853.3333333333, ans=0.0 2023-09-29 11:22:32,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=348853.3333333333, ans=0.2 2023-09-29 11:22:35,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:22:40,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:22:45,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:22:49,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:22:50,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 11:22:52,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:22:52,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:53,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:55,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:56,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:56,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 11:22:56,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:22:56,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:02,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=348986.6666666667, ans=0.1 2023-09-29 11:23:03,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:23:03,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:23:07,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:10,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:23:10,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:23:12,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 11:23:12,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 11:23:12,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 11:23:17,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 11:23:18,491 INFO [train.py:1039] (1/4) Epoch 10, batch 4550, loss[loss=0.2154, simple_loss=0.2956, pruned_loss=0.06764, over 24457.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2762, pruned_loss=0.06899, over 4714961.92 frames. ], batch size: 69, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:23:20,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 11:23:20,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:23,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:25,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:28,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:32,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:23:34,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=349120.0, ans=15.0 2023-09-29 11:23:35,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:23:38,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:23:38,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:23:38,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:41,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:43,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:45,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:23:49,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 11:23:50,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 11:23:52,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:23:53,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 11:23:56,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 11:23:57,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=349186.6666666667, ans=0.125 2023-09-29 11:23:57,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=349186.6666666667, ans=0.0 2023-09-29 11:23:59,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:02,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 11:24:05,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:24:08,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:24:10,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 11:24:13,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:15,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:15,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:16,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:17,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 11:24:17,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 11:24:19,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:24:19,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 11:24:20,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 11:24:22,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:23,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:24:23,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:25,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:25,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:24:26,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:24:27,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=349320.0, ans=0.0 2023-09-29 11:24:28,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 11:24:30,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:30,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:24:32,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 11:24:32,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:24:32,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 11:24:34,706 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.69 vs. limit=10.0 2023-09-29 11:24:35,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:24:35,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:24:37,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:24:38,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:38,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:24:41,250 INFO [train.py:1039] (1/4) Epoch 10, batch 4600, loss[loss=0.1862, simple_loss=0.2624, pruned_loss=0.055, over 24466.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2746, pruned_loss=0.06813, over 4711847.94 frames. ], batch size: 66, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:24:41,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:24:44,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:24:46,322 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.915e+02 2.143e+02 2.405e+02 4.065e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-29 11:24:47,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.96 vs. limit=15.0 2023-09-29 11:24:48,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:24:48,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:51,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:24:51,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:24:51,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:24:53,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 11:24:55,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:24:59,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:24:59,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:01,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:09,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 11:25:12,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:15,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:17,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:25:17,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:24,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 11:25:24,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:25:24,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:25:29,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:29,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:25:31,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:25:37,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 11:25:38,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:25:41,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=349586.6666666667, ans=0.1 2023-09-29 11:25:42,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:45,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:25:48,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:48,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 11:25:48,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:48,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 11:25:50,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:50,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:51,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:51,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:53,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:53,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 11:25:54,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=349653.3333333333, ans=0.125 2023-09-29 11:25:55,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 11:25:55,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 11:25:55,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:56,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:25:58,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:58,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:26:03,463 INFO [train.py:1039] (1/4) Epoch 10, batch 4650, loss[loss=0.1961, simple_loss=0.2698, pruned_loss=0.06121, over 24599.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2736, pruned_loss=0.06733, over 4724015.68 frames. ], batch size: 60, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:26:09,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:26:12,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:12,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:12,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:26:14,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:26:14,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:15,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:18,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 11:26:21,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:26:22,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 11:26:24,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:24,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 11:26:24,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:26:25,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 11:26:25,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 11:26:25,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:25,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=349786.6666666667, ans=0.2 2023-09-29 11:26:27,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:26:31,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:26:32,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:32,993 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 11:26:35,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:36,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 11:26:39,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:40,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:26:41,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 11:26:42,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:26:42,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=349853.3333333333, ans=0.0 2023-09-29 11:26:45,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:26:50,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.05 vs. limit=15.0 2023-09-29 11:26:52,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:55,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:57,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-09-29 11:26:58,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:00,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:02,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:27:02,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 11:27:03,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 11:27:03,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 11:27:03,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 11:27:06,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:12,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:27:12,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:14,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 11:27:14,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:16,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:16,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:27:17,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:27:20,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:27:20,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:22,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:25,777 INFO [train.py:1039] (1/4) Epoch 10, batch 4700, loss[loss=0.1909, simple_loss=0.2778, pruned_loss=0.05202, over 24315.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2745, pruned_loss=0.06746, over 4731892.26 frames. ], batch size: 74, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:27:26,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:26,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:27:26,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:27:27,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 11:27:27,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:27:27,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=350053.3333333333, ans=0.125 2023-09-29 11:27:29,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 11:27:30,693 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.991e+02 2.178e+02 2.659e+02 4.780e+02, threshold=4.356e+02, percent-clipped=1.0 2023-09-29 11:27:39,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:40,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:40,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:27:43,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:44,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:27:45,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=350120.0, ans=0.04949747468305833 2023-09-29 11:27:49,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 11:27:49,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 11:27:51,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=350120.0, ans=0.125 2023-09-29 11:27:52,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:54,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:27:54,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:59,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:06,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:28:08,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:28:10,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:28:10,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=350186.6666666667, ans=0.125 2023-09-29 11:28:15,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=350253.3333333333, ans=0.0 2023-09-29 11:28:17,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 11:28:18,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:28:20,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:24,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 11:28:27,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:28:32,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:28:33,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 11:28:33,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:36,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:36,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:28:36,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 11:28:38,928 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 11:28:39,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:40,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 11:28:42,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:45,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.78 vs. limit=15.0 2023-09-29 11:28:47,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 11:28:49,344 INFO [train.py:1039] (1/4) Epoch 10, batch 4750, loss[loss=0.1966, simple_loss=0.2699, pruned_loss=0.06165, over 24660.00 frames. ], tot_loss[loss=0.2049, simple_loss=0.2752, pruned_loss=0.06735, over 4737902.36 frames. ], batch size: 65, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:28:49,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=350386.6666666667, ans=0.0 2023-09-29 11:28:50,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:28:52,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:28:59,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 11:28:59,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:03,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 11:29:05,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:29:05,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:29:05,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:06,317 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.22 vs. limit=6.0 2023-09-29 11:29:11,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=350453.3333333333, ans=0.125 2023-09-29 11:29:12,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 11:29:18,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:29:19,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 11:29:19,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:22,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:22,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:24,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:25,123 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 11:29:25,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 11:29:27,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=350520.0, ans=0.0 2023-09-29 11:29:30,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 11:29:34,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:34,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=350520.0, ans=10.0 2023-09-29 11:29:36,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:29:39,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:29:39,127 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 11:29:39,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:29:42,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:29:45,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:29:47,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 11:29:47,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 11:29:47,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:48,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:29:48,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:50,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:29:50,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 11:29:55,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 11:29:57,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:01,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:30:01,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 11:30:02,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:04,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:04,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:30:06,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:06,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:30:06,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=350653.3333333333, ans=0.0 2023-09-29 11:30:09,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:09,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 11:30:09,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 11:30:11,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 11:30:11,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=350720.0, ans=0.0 2023-09-29 11:30:12,437 INFO [train.py:1039] (1/4) Epoch 10, batch 4800, loss[loss=0.216, simple_loss=0.2747, pruned_loss=0.07861, over 23283.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2764, pruned_loss=0.06889, over 4716530.18 frames. ], batch size: 119, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:30:15,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:30:15,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:16,051 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-09-29 11:30:16,721 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.911e+02 2.173e+02 2.490e+02 3.366e+02, threshold=4.345e+02, percent-clipped=0.0 2023-09-29 11:30:16,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 11:30:22,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:28,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:30:28,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:30,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:30,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 11:30:31,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:31,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:30:35,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:30:39,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:30:39,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=350786.6666666667, ans=0.1 2023-09-29 11:30:39,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.76 vs. limit=15.0 2023-09-29 11:30:40,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:42,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:30:43,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:43,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:30:44,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:45,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:49,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:51,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:52,212 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.62 vs. limit=15.0 2023-09-29 11:30:53,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:53,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:30:55,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:30:56,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.84 vs. limit=15.0 2023-09-29 11:30:56,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:58,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 11:30:58,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 11:30:59,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:59,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:30:59,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:31:00,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:02,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:31:05,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:31:05,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:07,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=350920.0, ans=0.125 2023-09-29 11:31:09,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:11,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:13,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:18,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 11:31:20,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:21,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:21,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:31:23,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:26,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:28,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:31:28,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:28,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:31:29,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:31:30,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:31:33,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:33,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:33,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:34,459 INFO [train.py:1039] (1/4) Epoch 10, batch 4850, loss[loss=0.1904, simple_loss=0.2528, pruned_loss=0.06398, over 24322.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2767, pruned_loss=0.06875, over 4722390.86 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:31:34,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 11:31:37,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 11:31:37,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:31:37,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:38,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.76 vs. limit=15.0 2023-09-29 11:31:41,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:51,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 11:31:51,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:54,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=351120.0, ans=0.125 2023-09-29 11:31:56,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:31:57,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:31:57,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:01,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:32:02,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:32:04,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:32:04,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 11:32:07,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:32:10,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:32:11,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:32:11,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:32:11,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 11:32:11,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=351186.6666666667, ans=0.125 2023-09-29 11:32:11,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=351186.6666666667, ans=0.1 2023-09-29 11:32:14,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:32:15,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:16,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=351186.6666666667, ans=0.125 2023-09-29 11:32:18,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 11:32:21,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 11:32:22,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:32:29,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:32:30,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 11:32:32,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:32:32,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:32:34,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:32:36,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 11:32:36,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:36,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 11:32:37,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:37,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:32:39,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 11:32:44,092 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.62 vs. limit=15.0 2023-09-29 11:32:49,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:54,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:32:54,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:32:57,383 INFO [train.py:1039] (1/4) Epoch 10, batch 4900, loss[loss=0.1932, simple_loss=0.2469, pruned_loss=0.06977, over 23523.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2755, pruned_loss=0.0689, over 4697809.35 frames. ], batch size: 256, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:32:59,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.29 vs. limit=12.0 2023-09-29 11:33:00,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 11:33:00,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:33:02,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 2.045e+02 2.293e+02 2.550e+02 3.770e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 11:33:06,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:09,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:33:12,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 11:33:17,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 11:33:22,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 11:33:23,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 11:33:23,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:23,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:25,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:33:25,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:25,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:33:25,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 11:33:26,184 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.89 vs. limit=15.0 2023-09-29 11:33:30,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 11:33:30,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:33:32,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:33:32,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.36 vs. limit=22.5 2023-09-29 11:33:34,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:35,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:33:35,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:37,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:37,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 11:33:38,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:33:42,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:42,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 11:33:42,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 11:33:45,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 11:33:47,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:33:50,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:33:50,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:33:52,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:52,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:33:52,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:33:52,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 11:33:55,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:56,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:33:58,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:34:00,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=351586.6666666667, ans=0.125 2023-09-29 11:34:01,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 11:34:03,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:34:03,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 11:34:03,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=351653.3333333333, ans=0.125 2023-09-29 11:34:04,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 11:34:11,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:13,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:14,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 11:34:15,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:15,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:34:16,318 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.14 vs. limit=15.0 2023-09-29 11:34:17,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:20,064 INFO [train.py:1039] (1/4) Epoch 10, batch 4950, loss[loss=0.177, simple_loss=0.2568, pruned_loss=0.04858, over 24322.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2741, pruned_loss=0.06811, over 4708159.64 frames. ], batch size: 61, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:34:20,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:20,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:34:22,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:22,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:34:23,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:34:25,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:25,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:28,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 11:34:29,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=351720.0, ans=0.05 2023-09-29 11:34:30,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 11:34:30,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:34:31,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 11:34:31,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:31,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:34:31,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:34:31,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=351720.0, ans=0.125 2023-09-29 11:34:33,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:34,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:36,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:34:37,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:34:37,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:39,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:41,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:44,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:34:48,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:49,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:52,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:52,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:54,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:34:56,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 11:34:57,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 11:34:59,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:00,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:35:00,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:35:03,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:03,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:35:05,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:35:08,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:10,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:35:11,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:35:13,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:13,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:15,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 11:35:15,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:35:17,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:35:21,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:35:23,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:35:23,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:35:23,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:24,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:35:26,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:35:28,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:35:28,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:35:30,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:31,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 11:35:35,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=351986.6666666667, ans=0.125 2023-09-29 11:35:36,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:35:36,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=351986.6666666667, ans=0.0 2023-09-29 11:35:41,015 INFO [train.py:1039] (1/4) Epoch 10, batch 5000, loss[loss=0.2058, simple_loss=0.2836, pruned_loss=0.06403, over 24343.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2738, pruned_loss=0.06804, over 4710733.96 frames. ], batch size: 77, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:35:41,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 11:35:41,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:35:46,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:47,803 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.017e+02 2.302e+02 2.737e+02 4.823e+02, threshold=4.603e+02, percent-clipped=1.0 2023-09-29 11:35:47,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:35:49,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 11:35:51,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 11:35:51,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=352053.3333333333, ans=0.125 2023-09-29 11:35:53,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:35:54,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 11:35:54,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:54,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:35:56,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 11:35:57,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:59,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:01,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 11:36:01,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:01,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:01,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=352120.0, ans=0.0 2023-09-29 11:36:02,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 11:36:03,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 11:36:03,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:36:03,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 11:36:03,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:36:03,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:04,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:36:04,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 11:36:04,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 11:36:06,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 11:36:06,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:07,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:09,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 11:36:09,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:36:11,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:12,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:12,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:36:14,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 11:36:15,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:36:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:36:16,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=352186.6666666667, ans=0.125 2023-09-29 11:36:21,597 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 11:36:24,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:26,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:26,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:32,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 11:36:32,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:32,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:32,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:36:34,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 11:36:35,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:37,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:38,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:36:44,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 11:36:49,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:58,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:59,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=352320.0, ans=0.125 2023-09-29 11:37:00,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:00,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:37:00,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:01,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:37:01,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:37:03,145 INFO [train.py:1039] (1/4) Epoch 10, batch 5050, loss[loss=0.2083, simple_loss=0.28, pruned_loss=0.06834, over 23712.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2739, pruned_loss=0.06823, over 4716854.88 frames. ], batch size: 179, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:37:03,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:07,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=352386.6666666667, ans=0.0 2023-09-29 11:37:08,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:08,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 11:37:10,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:37:13,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:16,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:37:16,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 11:37:17,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:17,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:37:20,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:37:22,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:37:22,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:37:24,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.40 vs. limit=15.0 2023-09-29 11:37:27,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=352453.3333333333, ans=0.1 2023-09-29 11:37:33,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=352453.3333333333, ans=0.125 2023-09-29 11:37:34,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 11:37:36,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:37:36,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:37:36,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 11:37:36,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:37:36,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=352520.0, ans=0.5 2023-09-29 11:37:39,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:39,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:41,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:37:41,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 11:37:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 11:37:44,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:46,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:37:49,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:51,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 11:37:51,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=352586.6666666667, ans=0.2 2023-09-29 11:37:52,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:37:55,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 11:37:57,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:37:57,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:37:58,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:59,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:38:00,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:02,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:38:03,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:03,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:38:03,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:38:06,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 11:38:07,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:38:09,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:38:12,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:38:12,593 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 11:38:12,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:38:14,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:14,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:14,260 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 11:38:17,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:17,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 11:38:17,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:21,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:23,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:23,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 11:38:24,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 11:38:26,303 INFO [train.py:1039] (1/4) Epoch 10, batch 5100, loss[loss=0.2214, simple_loss=0.2767, pruned_loss=0.08301, over 23695.00 frames. ], tot_loss[loss=0.2066, simple_loss=0.2751, pruned_loss=0.06906, over 4716143.10 frames. ], batch size: 232, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:38:26,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:27,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:38:28,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:38:31,051 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 11:38:32,350 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.939e+02 2.293e+02 2.682e+02 4.893e+02, threshold=4.586e+02, percent-clipped=1.0 2023-09-29 11:38:34,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:34,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=352720.0, ans=0.125 2023-09-29 11:38:37,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 11:38:39,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 11:38:39,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=352720.0, ans=0.5 2023-09-29 11:38:41,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:42,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:44,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:44,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 11:38:44,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 11:38:50,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:50,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:38:57,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:59,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 11:38:59,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:01,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:39:02,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:39:05,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 11:39:07,676 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 11:39:09,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:09,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 11:39:09,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 11:39:15,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:21,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:22,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 11:39:22,991 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 11:39:23,004 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 11:39:24,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 11:39:24,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:29,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 11:39:33,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 11:39:37,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:39:39,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:39:41,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 11:39:44,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:39:44,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 11:39:48,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=353053.3333333333, ans=0.0 2023-09-29 11:39:49,540 INFO [train.py:1039] (1/4) Epoch 10, batch 5150, loss[loss=0.222, simple_loss=0.2831, pruned_loss=0.08046, over 23602.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2767, pruned_loss=0.06963, over 4710473.26 frames. ], batch size: 256, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:39:49,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:39:50,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:39:50,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:39:51,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:39:51,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:39:51,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:39:52,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 11:39:52,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 11:39:53,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=353053.3333333333, ans=0.1 2023-09-29 11:39:54,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 11:39:54,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:39:54,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 11:39:55,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:55,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:39:57,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:39:58,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:06,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:40:06,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 11:40:07,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:07,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:40:07,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:40:07,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:08,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=353120.0, ans=0.2 2023-09-29 11:40:09,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:09,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:40:09,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:40:10,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 11:40:11,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:40:12,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:12,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:40:15,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 11:40:15,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=353120.0, ans=0.0 2023-09-29 11:40:18,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:40:23,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.64 vs. limit=15.0 2023-09-29 11:40:24,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:40:24,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 11:40:29,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:34,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:35,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:39,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:41,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:43,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=353253.3333333333, ans=0.125 2023-09-29 11:40:44,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 11:40:49,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:51,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:40:51,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:52,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.98 vs. limit=15.0 2023-09-29 11:40:53,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=353253.3333333333, ans=0.04949747468305833 2023-09-29 11:40:54,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:55,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:57,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 11:41:00,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:02,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:41:04,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:41:04,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:41:05,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:41:05,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:41:05,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:41:05,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:41:09,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:41:11,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:41:13,144 INFO [train.py:1039] (1/4) Epoch 10, batch 5200, loss[loss=0.1941, simple_loss=0.254, pruned_loss=0.06717, over 23441.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2764, pruned_loss=0.06981, over 4692699.32 frames. ], batch size: 285, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:41:14,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:19,174 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.032e+02 2.395e+02 2.917e+02 4.034e+02, threshold=4.790e+02, percent-clipped=0.0 2023-09-29 11:41:19,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 11:41:19,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:41:21,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:21,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=353386.6666666667, ans=0.125 2023-09-29 11:41:24,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:24,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=353386.6666666667, ans=0.0 2023-09-29 11:41:26,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:41:26,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:27,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 11:41:30,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:41:32,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:35,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 11:41:38,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:41:39,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:41:40,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=353453.3333333333, ans=0.125 2023-09-29 11:41:41,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 11:41:41,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 11:41:41,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=353453.3333333333, ans=0.2 2023-09-29 11:41:41,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.75 vs. limit=10.0 2023-09-29 11:41:44,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 11:41:46,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:46,369 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 11:41:46,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:47,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:48,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:41:48,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 11:41:49,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:41:53,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:56,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 11:41:56,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 11:41:56,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 11:42:03,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 11:42:04,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:42:08,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.81 vs. limit=22.5 2023-09-29 11:42:09,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:42:09,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:10,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 11:42:10,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:42:12,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 11:42:12,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:12,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:15,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:17,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:42:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:42:22,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:22,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:28,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:30,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 11:42:32,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:32,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:42:32,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=353720.0, ans=0.025 2023-09-29 11:42:34,466 INFO [train.py:1039] (1/4) Epoch 10, batch 5250, loss[loss=0.2174, simple_loss=0.2952, pruned_loss=0.0698, over 23794.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2752, pruned_loss=0.06931, over 4696479.43 frames. ], batch size: 85, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:42:34,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:36,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:42:36,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:42:40,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:42:42,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=353720.0, ans=0.125 2023-09-29 11:42:43,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:45,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:42:45,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:42:47,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=353720.0, ans=0.1 2023-09-29 11:42:50,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:51,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:42:51,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=353786.6666666667, ans=0.1 2023-09-29 11:42:56,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:42:58,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:58,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 11:42:58,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:59,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:43:04,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=353853.3333333333, ans=0.1 2023-09-29 11:43:12,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-09-29 11:43:15,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=353853.3333333333, ans=0.0 2023-09-29 11:43:19,892 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=15.0 2023-09-29 11:43:48,676 INFO [train.py:1039] (1/4) Epoch 10, batch 5300, loss[loss=0.1744, simple_loss=0.2456, pruned_loss=0.05159, over 24551.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2737, pruned_loss=0.06869, over 4699360.67 frames. ], batch size: 60, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:43:54,371 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.989e+02 2.153e+02 2.436e+02 4.114e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 11:44:03,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=354120.0, ans=0.5 2023-09-29 11:44:05,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:44:05,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 11:44:05,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 11:44:05,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:06,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:06,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:06,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:06,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:06,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:06,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:06,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:44:06,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:44:07,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 11:44:07,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 11:44:07,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 11:44:07,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:44:07,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 11:44:07,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 11:44:07,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:08,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:08,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:08,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:08,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:44:09,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:09,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:09,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:09,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:09,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:09,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:44:09,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:09,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:44:10,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 11:44:10,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:11,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:11,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 11:44:11,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 11:44:11,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:44:11,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:11,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 11:44:11,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 11:44:11,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:12,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:44:13,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:13,569 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 11:44:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 11:44:13,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:44:13,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:14,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 11:44:14,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 11:44:14,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 11:44:14,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:22,141 INFO [train.py:1039] (1/4) Epoch 11, batch 0, loss[loss=0.2186, simple_loss=0.2796, pruned_loss=0.07884, over 22802.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2796, pruned_loss=0.07884, over 22802.00 frames. ], batch size: 322, lr: 9.67e-03, grad_scale: 32.0 2023-09-29 11:44:22,142 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 11:44:36,240 INFO [train.py:1071] (1/4) Epoch 11, validation: loss=0.3103, simple_loss=0.2886, pruned_loss=0.166, over 1125622.00 frames. 2023-09-29 11:44:36,241 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 11:44:38,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 11:44:38,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:44:42,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:44:45,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=354140.0, ans=0.125 2023-09-29 11:44:48,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:48,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:44:48,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:48,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 11:44:50,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 11:44:53,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:54,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:55,569 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.47 vs. limit=15.0 2023-09-29 11:44:57,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:57,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:59,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:44:59,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:00,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 11:45:02,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:11,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:45:11,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:12,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=354273.3333333333, ans=0.125 2023-09-29 11:45:13,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 11:45:14,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=354273.3333333333, ans=0.1 2023-09-29 11:45:18,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:45:18,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:45:20,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:22,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=354273.3333333333, ans=0.125 2023-09-29 11:45:24,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=354340.0, ans=0.0 2023-09-29 11:45:26,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:45:27,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.51 vs. limit=15.0 2023-09-29 11:45:30,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.21 vs. limit=15.0 2023-09-29 11:45:33,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:36,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 11:45:39,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 11:45:41,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:45:41,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:42,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:45:42,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=354406.6666666667, ans=0.0 2023-09-29 11:45:44,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:45,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 11:45:49,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:50,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:54,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:45:57,370 INFO [train.py:1039] (1/4) Epoch 11, batch 50, loss[loss=0.1794, simple_loss=0.2501, pruned_loss=0.05428, over 24357.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2783, pruned_loss=0.069, over 1066102.58 frames. ], batch size: 56, lr: 9.67e-03, grad_scale: 16.0 2023-09-29 11:45:57,559 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 11:46:00,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:46:02,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:05,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:05,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=354473.3333333333, ans=0.125 2023-09-29 11:46:07,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 11:46:07,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:46:08,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:46:08,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=354473.3333333333, ans=0.2 2023-09-29 11:46:10,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:11,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:14,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:17,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 11:46:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:24,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:46:25,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=354540.0, ans=0.1 2023-09-29 11:46:28,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 11:46:30,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 11:46:30,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=354606.6666666667, ans=0.125 2023-09-29 11:46:32,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:46:33,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:46:33,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:33,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:46:35,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:46:35,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:46:35,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:43,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:46:45,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:46:45,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:46:46,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 11:46:48,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:46:49,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:46:49,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 11:46:50,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:51,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 11:47:01,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:01,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:47:02,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:02,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=354740.0, ans=0.1 2023-09-29 11:47:04,553 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.921e+02 2.105e+02 2.466e+02 3.711e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 11:47:04,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:04,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:07,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 11:47:07,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 11:47:09,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:09,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:11,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:47:12,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:47:12,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 11:47:14,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 11:47:14,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:47:15,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:17,312 INFO [train.py:1039] (1/4) Epoch 11, batch 100, loss[loss=0.1754, simple_loss=0.2479, pruned_loss=0.05151, over 24460.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2782, pruned_loss=0.06846, over 1883906.58 frames. ], batch size: 58, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:47:17,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:47:18,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 11:47:18,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 11:47:19,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:20,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:22,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:47:22,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:47:26,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:47:28,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:47:32,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:34,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 11:47:34,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:36,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:47:36,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:38,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:38,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:38,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:40,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 11:47:42,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:47:42,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:42,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:42,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:47,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 11:47:47,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:48,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=354873.3333333333, ans=0.2 2023-09-29 11:47:49,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:49,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:47:51,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:47:51,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=354940.0, ans=0.04949747468305833 2023-09-29 11:47:54,444 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 11:47:54,468 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 11:47:54,955 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.20 vs. limit=10.0 2023-09-29 11:47:56,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:47:56,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:48:00,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:48:03,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:48:05,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:09,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:11,189 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 11:48:12,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:48:17,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:17,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:48:19,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:22,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:25,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:27,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:48:30,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:30,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.66 vs. limit=15.0 2023-09-29 11:48:31,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:33,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:33,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:48:33,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:33,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=355073.3333333333, ans=0.0 2023-09-29 11:48:34,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 11:48:34,657 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 11:48:34,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:36,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:48:37,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:37,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:38,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:48:38,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:48:38,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:48:38,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:39,665 INFO [train.py:1039] (1/4) Epoch 11, batch 150, loss[loss=0.2414, simple_loss=0.2933, pruned_loss=0.09482, over 22677.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2776, pruned_loss=0.06879, over 2511721.97 frames. ], batch size: 322, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:48:39,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:41,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:43,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:48:43,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:48:45,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:47,579 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.42 vs. limit=15.0 2023-09-29 11:48:48,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:48,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:48:48,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:52,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:53,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:58,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:59,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:02,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 11:49:02,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 11:49:02,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 11:49:07,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:49:07,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:49:07,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:49:08,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:49:08,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:10,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:10,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:11,815 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 11:49:13,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:20,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:25,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:49:26,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 11:49:28,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=355340.0, ans=0.0 2023-09-29 11:49:29,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:49:29,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:29,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:33,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:49:35,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:49:36,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:49:36,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:36,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 11:49:37,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=355340.0, ans=0.125 2023-09-29 11:49:41,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:42,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:49:42,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:49:42,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:49:43,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=355406.6666666667, ans=10.0 2023-09-29 11:49:44,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:46,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 11:49:46,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=355406.6666666667, ans=0.2 2023-09-29 11:49:47,844 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.912e+02 2.159e+02 2.654e+02 4.388e+02, threshold=4.317e+02, percent-clipped=1.0 2023-09-29 11:49:48,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:49:50,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:49:54,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:49:55,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:49:55,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 11:49:57,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:57,086 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 11:50:01,509 INFO [train.py:1039] (1/4) Epoch 11, batch 200, loss[loss=0.1674, simple_loss=0.2386, pruned_loss=0.04812, over 24296.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2777, pruned_loss=0.06843, over 3017843.18 frames. ], batch size: 56, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:50:01,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:02,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=355473.3333333333, ans=0.1 2023-09-29 11:50:05,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:50:05,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:50:05,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=355473.3333333333, ans=0.125 2023-09-29 11:50:08,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 11:50:09,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:09,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:13,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 11:50:14,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:50:16,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:17,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:20,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:50:22,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:22,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:43,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:50:43,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:50:44,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:50:44,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:50:46,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:50:46,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:50:46,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=355606.6666666667, ans=0.0 2023-09-29 11:50:47,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:48,855 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.40 vs. limit=15.0 2023-09-29 11:50:49,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:50:50,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:50,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:50:52,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 11:50:53,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:50:53,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:54,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=355673.3333333333, ans=0.125 2023-09-29 11:50:55,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=355673.3333333333, ans=0.1 2023-09-29 11:50:57,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=355673.3333333333, ans=0.2 2023-09-29 11:51:01,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:51:05,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:51:12,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:14,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:51:22,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:23,777 INFO [train.py:1039] (1/4) Epoch 11, batch 250, loss[loss=0.1732, simple_loss=0.2473, pruned_loss=0.04956, over 24288.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2767, pruned_loss=0.06844, over 3392143.75 frames. ], batch size: 56, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:51:25,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 11:51:25,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:25,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:51:25,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:26,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:51:28,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 11:51:29,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:51:29,980 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 11:51:31,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:33,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:51:34,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:34,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=355806.6666666667, ans=0.0 2023-09-29 11:51:35,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:37,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:51:37,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:40,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:51:42,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=355873.3333333333, ans=0.1 2023-09-29 11:51:44,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:51:53,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:51:57,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:57,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:52:03,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:52:04,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=355940.0, ans=0.05 2023-09-29 11:52:05,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:52:05,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=355940.0, ans=0.2 2023-09-29 11:52:07,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:52:07,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:07,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:52:07,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:52:07,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:10,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:52:12,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 11:52:12,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:52:15,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:52:15,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:52:15,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:52:17,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:19,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:52:19,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:52:20,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:23,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:52:23,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:27,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:52:30,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:32,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:52:33,631 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.944e+02 2.181e+02 2.498e+02 3.489e+02, threshold=4.363e+02, percent-clipped=0.0 2023-09-29 11:52:38,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:41,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:52:44,310 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.40 vs. limit=6.0 2023-09-29 11:52:45,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 11:52:46,471 INFO [train.py:1039] (1/4) Epoch 11, batch 300, loss[loss=0.2015, simple_loss=0.2863, pruned_loss=0.0584, over 24644.00 frames. ], tot_loss[loss=0.2066, simple_loss=0.2754, pruned_loss=0.06885, over 3674797.47 frames. ], batch size: 68, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:52:46,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:52:46,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:48,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 11:52:48,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:52:49,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:52:49,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 11:52:53,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:55,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:00,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:53:00,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 11:53:01,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:53:03,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:53:03,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 11:53:03,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:08,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:53:11,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:53:11,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 11:53:15,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 11:53:15,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:18,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:20,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:20,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 11:53:20,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:53:23,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:53:25,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:53:25,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=356273.3333333333, ans=0.2 2023-09-29 11:53:26,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:53:33,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:53:33,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 11:53:33,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:53:33,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.14 vs. limit=15.0 2023-09-29 11:53:36,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:37,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 11:53:39,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:42,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:53:47,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:53:47,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 11:53:51,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:51,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:53:54,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:56,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:53:56,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 11:53:57,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:53:59,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:00,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 11:54:02,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:54:03,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:05,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:05,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:06,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:09,945 INFO [train.py:1039] (1/4) Epoch 11, batch 350, loss[loss=0.2003, simple_loss=0.2689, pruned_loss=0.06583, over 23401.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2724, pruned_loss=0.06742, over 3911787.56 frames. ], batch size: 119, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:54:11,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:11,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:54:14,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:19,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:21,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:22,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:27,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 11:54:29,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:29,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 11:54:33,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:33,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 11:54:35,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:37,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 11:54:39,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:54:41,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:42,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:54:43,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=356606.6666666667, ans=0.125 2023-09-29 11:54:44,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:54:44,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:46,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:54:46,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=356606.6666666667, ans=0.125 2023-09-29 11:54:47,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:54:47,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:57,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:54:57,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:54:57,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:54:57,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:02,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 11:55:02,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:55:05,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=356673.3333333333, ans=0.125 2023-09-29 11:55:08,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:08,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:08,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:55:10,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 11:55:12,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:14,070 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 11:55:16,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 11:55:16,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:19,168 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.993e+02 2.217e+02 2.521e+02 3.405e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 11:55:19,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:55:19,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 11:55:22,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:25,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:55:25,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:25,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=356740.0, ans=0.1 2023-09-29 11:55:26,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:26,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:30,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:32,074 INFO [train.py:1039] (1/4) Epoch 11, batch 400, loss[loss=0.18, simple_loss=0.2565, pruned_loss=0.05173, over 24455.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2718, pruned_loss=0.06713, over 4081539.29 frames. ], batch size: 63, lr: 9.64e-03, grad_scale: 32.0 2023-09-29 11:55:33,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:55:37,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:55:37,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=356806.6666666667, ans=0.1 2023-09-29 11:55:38,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 11:55:38,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:38,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:41,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:55:42,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:43,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=356806.6666666667, ans=0.1 2023-09-29 11:55:45,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:47,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:49,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 11:55:51,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 11:55:51,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:52,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 11:55:52,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=356873.3333333333, ans=0.125 2023-09-29 11:55:53,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:57,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:55:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:57,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 11:55:57,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:55:58,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:58,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:00,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:56:01,764 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 11:56:01,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 11:56:04,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=356940.0, ans=0.09899494936611666 2023-09-29 11:56:06,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:56:10,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:10,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 11:56:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 11:56:13,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:56:15,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:24,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 11:56:27,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:56:27,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=357006.6666666667, ans=0.1 2023-09-29 11:56:28,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 11:56:30,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:31,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:56:31,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 11:56:36,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:56:40,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=357073.3333333333, ans=0.125 2023-09-29 11:56:41,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:56:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:45,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:46,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 11:56:48,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:56:48,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=357073.3333333333, ans=0.1 2023-09-29 11:56:49,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 11:56:52,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:56:52,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:56:54,755 INFO [train.py:1039] (1/4) Epoch 11, batch 450, loss[loss=0.2029, simple_loss=0.2719, pruned_loss=0.06699, over 23251.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2723, pruned_loss=0.06729, over 4228701.88 frames. ], batch size: 105, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:56:54,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 11:56:56,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:56:56,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=357140.0, ans=0.0 2023-09-29 11:56:58,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:56:59,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:56:59,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 11:57:01,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:57:01,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:57:01,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:01,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 11:57:03,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:57:03,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:57:06,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:57:07,132 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.56 vs. limit=22.5 2023-09-29 11:57:16,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:16,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:19,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 11:57:21,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 11:57:24,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:57:24,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=357206.6666666667, ans=0.125 2023-09-29 11:57:26,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:28,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:28,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=357273.3333333333, ans=0.125 2023-09-29 11:57:32,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:34,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:36,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 11:57:37,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 11:57:40,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 11:57:40,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:57:40,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:42,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:57:44,229 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 11:57:44,246 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 11:57:44,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:44,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:57:46,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:57:48,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:57:49,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:51,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 11:57:52,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 11:57:54,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:57,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:57:58,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:58:00,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 11:58:04,476 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.885e+02 2.166e+02 2.451e+02 4.204e+02, threshold=4.332e+02, percent-clipped=0.0 2023-09-29 11:58:04,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:58:06,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 11:58:07,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 11:58:09,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:58:15,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:58:17,677 INFO [train.py:1039] (1/4) Epoch 11, batch 500, loss[loss=0.1803, simple_loss=0.2481, pruned_loss=0.05624, over 14768.00 frames. ], tot_loss[loss=0.204, simple_loss=0.2726, pruned_loss=0.0677, over 4320222.84 frames. ], batch size: 31, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:58:17,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:19,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:58:19,445 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 11:58:24,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:24,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:58:26,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:26,155 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 11:58:27,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 11:58:27,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:30,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:58:32,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:58:35,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:58:36,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=15.0 2023-09-29 11:58:37,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:37,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:39,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:58:49,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:50,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 11:58:50,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:58:50,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:52,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 11:58:52,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:58:55,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:58:56,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:58:57,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:58:57,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:58,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 11:59:02,580 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 11:59:06,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:06,619 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:59:07,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:59:10,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 11:59:11,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=357673.3333333333, ans=0.125 2023-09-29 11:59:15,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:59:16,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=357673.3333333333, ans=0.125 2023-09-29 11:59:17,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:20,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:24,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:31,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:35,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 11:59:35,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:35,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:38,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 11:59:39,826 INFO [train.py:1039] (1/4) Epoch 11, batch 550, loss[loss=0.1776, simple_loss=0.2551, pruned_loss=0.05008, over 24443.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.274, pruned_loss=0.06778, over 4406309.87 frames. ], batch size: 63, lr: 9.62e-03, grad_scale: 32.0 2023-09-29 11:59:39,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:59:41,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:46,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 11:59:48,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 11:59:49,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:50,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 11:59:50,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:59:50,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:52,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:59:53,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:59:56,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:56,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 11:59:58,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:00:03,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:04,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:04,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=357873.3333333333, ans=0.125 2023-09-29 12:00:06,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:06,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:10,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 12:00:10,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 12:00:13,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:00:16,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:00:16,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:19,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:00:23,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:23,347 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 12:00:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:25,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:00:28,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:28,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:00:28,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:00:30,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:31,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 12:00:33,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 12:00:34,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:34,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:34,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:00:34,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:00:38,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:00:40,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:00:43,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:00:43,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:43,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 12:00:46,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:00:48,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:49,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:00:49,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:51,042 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.069e+02 2.330e+02 2.802e+02 5.186e+02, threshold=4.661e+02, percent-clipped=1.0 2023-09-29 12:00:51,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=358073.3333333333, ans=0.1 2023-09-29 12:00:53,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:00:53,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:00:59,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 12:01:02,380 INFO [train.py:1039] (1/4) Epoch 11, batch 600, loss[loss=0.1941, simple_loss=0.2584, pruned_loss=0.06487, over 23533.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2747, pruned_loss=0.06834, over 4462964.63 frames. ], batch size: 134, lr: 9.62e-03, grad_scale: 16.0 2023-09-29 12:01:02,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 12:01:04,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:01:04,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:01:04,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=358140.0, ans=0.125 2023-09-29 12:01:06,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:14,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:01:14,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:01:17,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 12:01:20,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:01:20,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:22,753 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=15.0 2023-09-29 12:01:25,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 12:01:25,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:01:31,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 12:01:34,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:01:34,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:34,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:01:39,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:01:39,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:01:41,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:49,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:01:54,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:54,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:54,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:02:02,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 12:02:07,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:02:07,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:12,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=358406.6666666667, ans=0.125 2023-09-29 12:02:13,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 12:02:13,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:02:17,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 12:02:17,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:02:17,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:02:25,006 INFO [train.py:1039] (1/4) Epoch 11, batch 650, loss[loss=0.1909, simple_loss=0.2418, pruned_loss=0.06996, over 22604.00 frames. ], tot_loss[loss=0.2042, simple_loss=0.2739, pruned_loss=0.0672, over 4520921.68 frames. ], batch size: 322, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:02:25,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:02:26,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:02:29,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.92 vs. limit=10.0 2023-09-29 12:02:29,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:02:31,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:02:33,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:02:35,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 12:02:37,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:02:42,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=22.5 2023-09-29 12:02:43,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:02:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:44,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=358540.0, ans=0.0 2023-09-29 12:02:47,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:02:49,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.78 vs. limit=22.5 2023-09-29 12:02:51,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 12:02:53,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:02:55,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:57,631 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.00 vs. limit=15.0 2023-09-29 12:02:58,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:58,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:03:01,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:01,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:02,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:03:03,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:05,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:03:09,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:03:09,491 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 12:03:09,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:09,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:13,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:14,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:14,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:14,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:03:16,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 12:03:18,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:03:18,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:03:18,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:03:18,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:20,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=358673.3333333333, ans=0.125 2023-09-29 12:03:21,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:03:23,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 12:03:24,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 12:03:24,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:24,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:24,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:03:26,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:03:28,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:03:33,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:33,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:03:35,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:37,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.924e+02 2.251e+02 2.757e+02 4.294e+02, threshold=4.503e+02, percent-clipped=0.0 2023-09-29 12:03:37,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:37,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:03:37,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:46,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:03:46,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:46,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:03:46,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:48,396 INFO [train.py:1039] (1/4) Epoch 11, batch 700, loss[loss=0.2071, simple_loss=0.2741, pruned_loss=0.07004, over 23313.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.272, pruned_loss=0.06665, over 4546990.02 frames. ], batch size: 106, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:03:52,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 12:03:52,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 12:03:55,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 12:03:56,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:58,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:04:00,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 12:04:04,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:09,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:04:10,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:12,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:04:12,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:04:15,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:17,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:04:17,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:04:20,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 12:04:23,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 12:04:27,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:04:28,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:04:30,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:04:35,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:04:35,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 12:04:41,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:41,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:04:42,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 12:04:45,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:46,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=359006.6666666667, ans=0.125 2023-09-29 12:04:47,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:51,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:04:52,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=359006.6666666667, ans=0.0 2023-09-29 12:04:55,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.75 vs. limit=12.0 2023-09-29 12:04:57,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:04:57,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 12:05:00,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 12:05:00,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 12:05:00,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=359073.3333333333, ans=0.035 2023-09-29 12:05:05,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:05,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=359073.3333333333, ans=0.0 2023-09-29 12:05:06,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:08,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:10,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:10,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 12:05:12,362 INFO [train.py:1039] (1/4) Epoch 11, batch 750, loss[loss=0.2117, simple_loss=0.2892, pruned_loss=0.06709, over 24574.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2718, pruned_loss=0.06635, over 4585421.19 frames. ], batch size: 71, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:05:15,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 12:05:15,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 12:05:15,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 12:05:17,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 12:05:17,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 12:05:17,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:05:18,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 12:05:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:20,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:20,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=359140.0, ans=0.125 2023-09-29 12:05:23,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:26,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:26,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:05:26,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:28,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:05:29,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:05:31,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:05:32,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:34,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:35,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=359206.6666666667, ans=0.05 2023-09-29 12:05:36,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 12:05:36,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:05:39,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:39,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:41,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:05:43,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 12:05:43,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:45,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 12:05:45,148 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 12:05:47,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 12:05:47,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:05:47,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:05:50,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:05:57,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:57,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:05:57,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:06:00,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:06:02,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:04,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 12:06:05,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:06:06,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 12:06:07,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:06:12,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:06:14,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 12:06:14,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:18,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:19,652 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.87 vs. limit=15.0 2023-09-29 12:06:21,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:06:22,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.014e+02 2.278e+02 2.730e+02 4.361e+02, threshold=4.557e+02, percent-clipped=0.0 2023-09-29 12:06:22,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:24,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=359406.6666666667, ans=0.125 2023-09-29 12:06:25,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:06:28,632 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-09-29 12:06:29,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 12:06:29,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:29,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=359406.6666666667, ans=0.125 2023-09-29 12:06:30,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:32,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:33,829 INFO [train.py:1039] (1/4) Epoch 11, batch 800, loss[loss=0.1999, simple_loss=0.2638, pruned_loss=0.06799, over 23636.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2713, pruned_loss=0.06571, over 4616429.11 frames. ], batch size: 120, lr: 9.60e-03, grad_scale: 32.0 2023-09-29 12:06:33,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:35,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:35,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:06:40,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=359473.3333333333, ans=0.0 2023-09-29 12:06:45,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:45,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:45,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=359473.3333333333, ans=0.05 2023-09-29 12:06:47,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:47,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:50,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:50,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:52,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:55,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:57,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:01,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 12:07:02,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:02,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:07:02,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:04,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:04,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 12:07:04,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:04,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 12:07:04,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=359540.0, ans=10.0 2023-09-29 12:07:07,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=359606.6666666667, ans=0.0 2023-09-29 12:07:08,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:11,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:13,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:07:13,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:13,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=359606.6666666667, ans=0.2 2023-09-29 12:07:16,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:16,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:20,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:07:20,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:07:22,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 12:07:23,703 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 12:07:23,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 12:07:23,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:07:23,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:07:27,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:27,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:07:32,516 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 12:07:32,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=359673.3333333333, ans=0.125 2023-09-29 12:07:33,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 12:07:34,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-09-29 12:07:35,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:07:37,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:07:41,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:07:43,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=359740.0, ans=0.125 2023-09-29 12:07:44,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:46,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 12:07:47,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:50,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 12:07:55,865 INFO [train.py:1039] (1/4) Epoch 11, batch 850, loss[loss=0.2561, simple_loss=0.306, pruned_loss=0.1031, over 19414.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2724, pruned_loss=0.06619, over 4643552.62 frames. ], batch size: 388, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:07:56,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:57,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:07:59,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 12:08:00,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:08:00,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:01,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=359806.6666666667, ans=0.0 2023-09-29 12:08:02,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 12:08:02,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:05,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:08:06,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:08,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:08:10,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:08:10,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 12:08:11,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 12:08:11,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 12:08:13,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:08:13,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:08:15,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:15,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:16,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:08:21,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:21,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:21,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 12:08:24,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 12:08:29,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:31,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 12:08:34,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 12:08:36,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 12:08:40,072 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 12:08:40,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:40,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:08:40,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:08:43,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:45,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:46,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 12:08:48,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:50,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:50,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:08:50,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:08:51,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:08:53,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:08:54,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 12:08:58,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:08:58,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:08:59,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:08:59,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:01,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:03,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:09:03,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=360073.3333333333, ans=0.125 2023-09-29 12:09:04,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:09:06,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:09:07,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:07,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:09:09,283 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.084e+02 2.353e+02 2.728e+02 3.950e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 12:09:12,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=360073.3333333333, ans=0.125 2023-09-29 12:09:16,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:09:17,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=360073.3333333333, ans=0.0 2023-09-29 12:09:18,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:09:19,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=360140.0, ans=15.0 2023-09-29 12:09:19,681 INFO [train.py:1039] (1/4) Epoch 11, batch 900, loss[loss=0.2065, simple_loss=0.2821, pruned_loss=0.06539, over 24413.00 frames. ], tot_loss[loss=0.2037, simple_loss=0.2738, pruned_loss=0.06682, over 4662736.08 frames. ], batch size: 77, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:09:19,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 12:09:19,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:19,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:22,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 12:09:26,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=360140.0, ans=0.2 2023-09-29 12:09:27,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:09:30,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:32,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 12:09:35,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:09:37,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 12:09:38,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 12:09:38,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:38,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:09:40,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:09:40,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:09:50,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:52,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:52,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:09:52,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=360273.3333333333, ans=0.95 2023-09-29 12:09:56,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:02,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 12:10:04,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:10:07,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:10:07,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:10:09,128 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 12:10:11,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 12:10:15,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:10:15,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:10:17,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:10:24,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:24,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:10:27,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 12:10:27,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:28,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 12:10:30,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:10:31,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:31,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:10:31,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:10:32,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.54 vs. limit=5.0 2023-09-29 12:10:37,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 12:10:37,955 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 12:10:39,405 INFO [train.py:1039] (1/4) Epoch 11, batch 950, loss[loss=0.207, simple_loss=0.2766, pruned_loss=0.06864, over 23595.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2733, pruned_loss=0.06641, over 4676956.23 frames. ], batch size: 120, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:10:40,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:10:40,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 12:10:43,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:44,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=360473.3333333333, ans=0.0 2023-09-29 12:10:46,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 12:10:50,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:10:52,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:54,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:54,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:10:55,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=360540.0, ans=0.125 2023-09-29 12:10:58,468 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 12:11:00,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.28 vs. limit=22.5 2023-09-29 12:11:01,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:01,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:03,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:03,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:11:03,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 12:11:04,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:11:06,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:08,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 12:11:08,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:08,437 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:11:12,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:12,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:12,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:11:14,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 12:11:16,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:11:16,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=360606.6666666667, ans=0.125 2023-09-29 12:11:17,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:19,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:11:24,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:11:24,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:31,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 12:11:31,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:11:31,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:11:31,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:31,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:31,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:11:37,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 12:11:37,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:11:42,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:42,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:42,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 12:11:43,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:43,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:11:43,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 12:11:48,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:11:50,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:51,687 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.923e+02 2.189e+02 2.546e+02 4.043e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 12:11:53,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:11:53,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 12:11:53,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=360740.0, ans=0.0 2023-09-29 12:11:55,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 12:12:00,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:12:02,423 INFO [train.py:1039] (1/4) Epoch 11, batch 1000, loss[loss=0.209, simple_loss=0.2456, pruned_loss=0.0862, over 19284.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2722, pruned_loss=0.06648, over 4668009.09 frames. ], batch size: 388, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:12:05,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 12:12:05,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:10,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:12:11,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 12:12:11,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 12:12:12,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.11 vs. limit=6.0 2023-09-29 12:12:16,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:16,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:12:19,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:21,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 12:12:24,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 12:12:26,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 12:12:26,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:29,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 12:12:31,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 12:12:31,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 12:12:33,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:35,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:44,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:44,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:12:46,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:47,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:47,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 12:12:47,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:48,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:12:49,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:49,161 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 12:12:55,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 12:12:55,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 12:12:56,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 12:12:59,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:13:09,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:09,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:13:10,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:10,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:13:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 12:13:13,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:13:13,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 12:13:15,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 12:13:16,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:16,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:13:18,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:13:20,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:13:21,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:13:23,150 INFO [train.py:1039] (1/4) Epoch 11, batch 1050, loss[loss=0.2108, simple_loss=0.2602, pruned_loss=0.08071, over 22768.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2715, pruned_loss=0.06602, over 4679383.08 frames. ], batch size: 322, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:13:24,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:13:26,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:13:27,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:13:29,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:32,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:13:35,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:13:36,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:13:39,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:13:42,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:13:42,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:13:43,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:13:44,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 12:13:45,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:13:46,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 12:13:46,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=361206.6666666667, ans=0.125 2023-09-29 12:13:49,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:49,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 12:13:49,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:13:55,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:56,033 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:13:57,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:13:57,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:14:00,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 12:14:00,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 12:14:00,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:14:03,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 12:14:06,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 12:14:08,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:12,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:14:13,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:14:14,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:14:16,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:14:21,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:14:24,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 12:14:26,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 12:14:26,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 12:14:26,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:27,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:14:27,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 12:14:32,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:14:33,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:33,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:14:35,261 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.858e+02 2.245e+02 2.592e+02 4.386e+02, threshold=4.489e+02, percent-clipped=1.0 2023-09-29 12:14:35,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:35,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:39,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:39,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 12:14:40,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=361406.6666666667, ans=0.125 2023-09-29 12:14:42,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:42,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 12:14:42,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 12:14:43,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:14:43,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=361473.3333333333, ans=0.1 2023-09-29 12:14:44,798 INFO [train.py:1039] (1/4) Epoch 11, batch 1100, loss[loss=0.2008, simple_loss=0.2666, pruned_loss=0.06753, over 23768.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2714, pruned_loss=0.06604, over 4683194.56 frames. ], batch size: 179, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:14:47,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:14:51,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.07 vs. limit=22.5 2023-09-29 12:14:52,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:14:54,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=361473.3333333333, ans=0.125 2023-09-29 12:14:56,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=361473.3333333333, ans=0.0 2023-09-29 12:14:57,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:14:59,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:14:59,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:14:59,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=361473.3333333333, ans=0.125 2023-09-29 12:15:00,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 12:15:00,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:02,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:15:05,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:15:08,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:15:08,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 12:15:10,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:15:11,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:11,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:15:13,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:15:15,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:15:19,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:15:23,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 12:15:24,374 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 12:15:24,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:28,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:29,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:15:29,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:15:31,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 12:15:31,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:15:31,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:15:31,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:15:32,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:32,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 12:15:39,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:15:39,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 12:15:40,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:15:44,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:15:50,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 12:15:50,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:15:50,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=361673.3333333333, ans=0.125 2023-09-29 12:15:51,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:54,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:54,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=361740.0, ans=0.125 2023-09-29 12:15:56,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:57,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 12:15:59,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:15:59,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:16:00,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 12:16:02,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:16:02,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 12:16:04,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:16:04,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:16:05,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:16:08,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.68 vs. limit=12.0 2023-09-29 12:16:08,844 INFO [train.py:1039] (1/4) Epoch 11, batch 1150, loss[loss=0.2237, simple_loss=0.2822, pruned_loss=0.0826, over 23356.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2725, pruned_loss=0.06691, over 4676773.15 frames. ], batch size: 285, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:16:09,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:10,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:16:12,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=361806.6666666667, ans=0.0 2023-09-29 12:16:13,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:13,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:16:13,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 12:16:15,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:17,310 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-09-29 12:16:18,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 12:16:18,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:18,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:16:25,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 12:16:27,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:30,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:32,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:33,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 12:16:33,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:16:33,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:40,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 12:16:40,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:41,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:42,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=361940.0, ans=0.1 2023-09-29 12:16:43,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=361940.0, ans=0.0 2023-09-29 12:16:51,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:58,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:17:00,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 12:17:00,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:00,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:08,486 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 12:17:10,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:19,597 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 12:17:21,044 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.993e+02 2.217e+02 2.557e+02 3.633e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 12:17:24,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:25,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:17:25,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:17:25,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:17:27,683 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:17:30,722 INFO [train.py:1039] (1/4) Epoch 11, batch 1200, loss[loss=0.2003, simple_loss=0.2788, pruned_loss=0.06096, over 24634.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2733, pruned_loss=0.06651, over 4699771.71 frames. ], batch size: 68, lr: 9.57e-03, grad_scale: 32.0 2023-09-29 12:17:30,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:36,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:17:36,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:17:41,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:17:41,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:42,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:17:44,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:17:44,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=362140.0, ans=0.125 2023-09-29 12:17:45,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:17:47,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:47,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:48,919 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 12:17:49,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=362206.6666666667, ans=0.1 2023-09-29 12:17:51,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 12:17:55,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:17:58,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:18:00,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:01,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:01,898 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 12:18:03,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:18:12,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:18:12,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 12:18:12,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:18:17,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 12:18:20,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 12:18:20,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:20,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:18:22,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:22,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:18:24,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=362340.0, ans=0.0 2023-09-29 12:18:25,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:25,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:18:25,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:18:27,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 12:18:28,221 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.38 vs. limit=10.0 2023-09-29 12:18:28,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:18:29,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:29,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:18:30,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:30,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:35,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:18:36,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:18:40,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 12:18:46,538 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 12:18:48,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:51,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:52,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:18:52,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=362473.3333333333, ans=0.0 2023-09-29 12:18:54,028 INFO [train.py:1039] (1/4) Epoch 11, batch 1250, loss[loss=0.1911, simple_loss=0.2725, pruned_loss=0.05489, over 24549.00 frames. ], tot_loss[loss=0.2059, simple_loss=0.2753, pruned_loss=0.06821, over 4688987.49 frames. ], batch size: 71, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:18:54,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:57,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 12:18:59,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=362473.3333333333, ans=0.125 2023-09-29 12:19:01,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:19:03,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:05,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 12:19:07,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:19:08,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:19:10,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=362540.0, ans=0.125 2023-09-29 12:19:13,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:19:13,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:14,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:19:14,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:19,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:19:23,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:19:23,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:19:24,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:19:26,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:27,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:28,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=362606.6666666667, ans=0.05 2023-09-29 12:19:30,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:32,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:19:37,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 12:19:37,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:19:38,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:19:39,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 12:19:40,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:40,995 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 12:19:41,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:41,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:41,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=362606.6666666667, ans=10.0 2023-09-29 12:19:47,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:48,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:50,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:19:51,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 12:19:51,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 12:19:53,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 12:19:57,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:19:59,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 12:19:59,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:02,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:20:02,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:20:05,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 12:20:05,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:20:05,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:20:06,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:20:06,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:08,085 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.898e+02 2.110e+02 2.286e+02 3.124e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-29 12:20:08,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 12:20:11,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:12,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:20:14,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:20:16,405 INFO [train.py:1039] (1/4) Epoch 11, batch 1300, loss[loss=0.2264, simple_loss=0.2804, pruned_loss=0.08613, over 23797.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2746, pruned_loss=0.06769, over 4702084.36 frames. ], batch size: 164, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:20:16,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:20:21,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:21,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 12:20:24,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=362806.6666666667, ans=0.125 2023-09-29 12:20:27,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:29,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:20:30,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:20:31,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:31,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:20:33,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 12:20:39,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:20:40,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:20:41,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 12:20:44,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:20:46,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=362873.3333333333, ans=0.0 2023-09-29 12:20:47,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:48,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:50,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:51,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.50 vs. limit=10.0 2023-09-29 12:20:52,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:53,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:20:55,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:20:55,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 12:21:02,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:21:02,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:21:03,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=362940.0, ans=0.2 2023-09-29 12:21:04,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 12:21:04,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:21:06,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:21:08,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:21:09,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 12:21:09,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:09,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 12:21:12,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:15,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:21:15,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:21:18,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 12:21:20,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 12:21:20,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=363073.3333333333, ans=0.125 2023-09-29 12:21:21,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 12:21:25,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:21:28,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 12:21:29,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:38,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.92 vs. limit=15.0 2023-09-29 12:21:39,081 INFO [train.py:1039] (1/4) Epoch 11, batch 1350, loss[loss=0.222, simple_loss=0.2971, pruned_loss=0.07351, over 23696.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2741, pruned_loss=0.06743, over 4698607.87 frames. ], batch size: 85, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:21:39,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 12:21:42,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:44,658 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-09-29 12:21:45,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:21:47,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=363140.0, ans=0.125 2023-09-29 12:21:48,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:48,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:50,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:21:50,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:54,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:56,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 12:21:58,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:21:58,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:22:01,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 12:22:03,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:22:03,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:22:03,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 12:22:05,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=363206.6666666667, ans=0.125 2023-09-29 12:22:06,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 12:22:09,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 12:22:10,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:11,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 12:22:13,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=363273.3333333333, ans=0.125 2023-09-29 12:22:18,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=363273.3333333333, ans=0.0 2023-09-29 12:22:22,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:25,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=363273.3333333333, ans=0.1 2023-09-29 12:22:32,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:32,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:32,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 12:22:36,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:37,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 12:22:38,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:22:39,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:22:41,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:22:42,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=363340.0, ans=0.125 2023-09-29 12:22:44,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 12:22:46,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:22:52,345 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.973e+02 2.286e+02 2.663e+02 4.619e+02, threshold=4.571e+02, percent-clipped=1.0 2023-09-29 12:22:52,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 12:22:52,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=363406.6666666667, ans=0.125 2023-09-29 12:22:54,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 12:22:57,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=363406.6666666667, ans=0.125 2023-09-29 12:22:58,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=363473.3333333333, ans=0.07 2023-09-29 12:22:59,928 INFO [train.py:1039] (1/4) Epoch 11, batch 1400, loss[loss=0.1879, simple_loss=0.2715, pruned_loss=0.05216, over 24617.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2723, pruned_loss=0.06643, over 4693187.48 frames. ], batch size: 68, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:23:01,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 12:23:03,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:23:04,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:23:04,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:23:13,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 12:23:15,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 12:23:25,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=363540.0, ans=0.1 2023-09-29 12:23:28,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:23:29,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:31,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:23:31,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:23:35,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:23:35,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:23:40,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=363606.6666666667, ans=0.125 2023-09-29 12:23:47,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:49,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:53,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 12:23:55,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:23:56,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:23:56,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:23:56,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:58,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:23:58,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:23:58,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:24:01,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 12:24:01,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:24:04,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=363740.0, ans=0.125 2023-09-29 12:24:06,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:06,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=363740.0, ans=0.95 2023-09-29 12:24:06,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=363740.0, ans=0.0 2023-09-29 12:24:08,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=363740.0, ans=0.125 2023-09-29 12:24:09,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:24:14,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 12:24:15,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:24:17,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:24:21,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:24:21,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:22,680 INFO [train.py:1039] (1/4) Epoch 11, batch 1450, loss[loss=0.206, simple_loss=0.2807, pruned_loss=0.06566, over 23423.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.272, pruned_loss=0.06619, over 4707351.97 frames. ], batch size: 93, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:24:24,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:24:28,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:24:31,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:24:31,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:31,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:24:36,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:37,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:24:39,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:24:40,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 12:24:42,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:24:42,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 12:24:43,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:43,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:43,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 12:24:45,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:24:45,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:24:47,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 12:24:47,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:48,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:24:50,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:53,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:57,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:24:57,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:25:00,534 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.97 vs. limit=15.0 2023-09-29 12:25:01,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:25:01,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:03,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:25:03,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:25:03,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:04,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:09,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 12:25:11,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:25:15,931 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 12:25:16,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:17,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:25:17,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=364006.6666666667, ans=0.125 2023-09-29 12:25:19,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:20,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 12:25:23,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:25,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 12:25:27,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 12:25:27,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:33,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:25:34,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:36,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 12:25:38,105 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.910e+02 2.158e+02 2.591e+02 3.926e+02, threshold=4.316e+02, percent-clipped=0.0 2023-09-29 12:25:38,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 12:25:39,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 12:25:41,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:25:45,913 INFO [train.py:1039] (1/4) Epoch 11, batch 1500, loss[loss=0.2082, simple_loss=0.272, pruned_loss=0.07224, over 23728.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2715, pruned_loss=0.06583, over 4716400.57 frames. ], batch size: 135, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:25:53,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 12:25:54,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:25:54,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:25:55,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:56,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:25:56,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:25:58,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 12:26:01,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:26:01,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:26:01,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:26:02,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:26:04,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:06,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:12,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:12,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 12:26:12,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:26:14,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:26:14,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:17,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 12:26:22,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 12:26:23,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:26:24,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 12:26:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:26:26,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=364273.3333333333, ans=0.125 2023-09-29 12:26:29,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:26:30,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:30,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:26:33,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 12:26:34,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:26:34,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:35,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 12:26:36,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:42,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:26:42,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 12:26:45,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=364340.0, ans=0.0 2023-09-29 12:26:50,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:26:51,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:26:53,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=364406.6666666667, ans=0.05 2023-09-29 12:26:54,814 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 12:26:54,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:26:54,917 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 12:26:55,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.59 vs. limit=10.0 2023-09-29 12:26:56,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:57,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:26:59,350 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 12:27:00,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:27:02,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 12:27:05,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:06,858 INFO [train.py:1039] (1/4) Epoch 11, batch 1550, loss[loss=0.2121, simple_loss=0.2714, pruned_loss=0.07638, over 23815.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2728, pruned_loss=0.06618, over 4727662.97 frames. ], batch size: 164, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:27:07,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:27:10,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 12:27:11,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=364473.3333333333, ans=0.1 2023-09-29 12:27:12,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 12:27:12,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:27:13,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 12:27:13,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 12:27:17,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:19,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:19,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:21,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:27:21,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:22,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:25,972 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 12:27:26,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:27,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:27:27,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:27:30,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:27:30,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 12:27:32,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:32,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 12:27:33,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 12:27:33,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 12:27:33,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:35,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:39,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:27:42,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 12:27:42,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 12:27:44,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=364606.6666666667, ans=0.025 2023-09-29 12:27:51,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:57,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:57,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:27:57,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:27:58,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 12:28:01,144 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.04 vs. limit=6.0 2023-09-29 12:28:01,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:28:03,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:06,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:28:08,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:28:08,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:08,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 12:28:09,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:11,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:28:12,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:14,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:28:14,072 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 12:28:15,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:22,361 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.938e+02 2.255e+02 2.720e+02 4.386e+02, threshold=4.510e+02, percent-clipped=1.0 2023-09-29 12:28:22,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 12:28:28,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:29,649 INFO [train.py:1039] (1/4) Epoch 11, batch 1600, loss[loss=0.2028, simple_loss=0.2818, pruned_loss=0.0619, over 24485.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2733, pruned_loss=0.06672, over 4728253.88 frames. ], batch size: 66, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:28:29,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:31,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 12:28:32,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:34,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:34,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:28:34,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:28:34,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:28:37,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:37,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 12:28:39,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 12:28:40,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 12:28:43,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:28:45,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 12:28:45,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:28:48,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:28:53,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:56,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 12:28:59,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:28:59,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 12:28:59,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=364873.3333333333, ans=0.0 2023-09-29 12:29:01,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:02,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 12:29:06,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=364940.0, ans=0.2 2023-09-29 12:29:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 12:29:15,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 12:29:15,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:29:15,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:29:18,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 12:29:23,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 12:29:26,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:29:26,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:28,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:30,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:29:32,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:29:34,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:29:34,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.39 vs. limit=15.0 2023-09-29 12:29:35,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:29:41,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:43,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:29:46,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 12:29:46,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:29:46,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 12:29:50,778 INFO [train.py:1039] (1/4) Epoch 11, batch 1650, loss[loss=0.1942, simple_loss=0.2745, pruned_loss=0.05698, over 24456.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2748, pruned_loss=0.06711, over 4724153.58 frames. ], batch size: 69, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:29:50,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:29:52,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:29:53,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:29:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 12:29:53,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 12:29:53,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 12:29:55,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 12:29:58,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:58,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:00,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:00,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:30:00,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=365140.0, ans=0.0 2023-09-29 12:30:01,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:04,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 12:30:07,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:30:07,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:07,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:30:07,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:30:08,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 12:30:08,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 12:30:16,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:30:18,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:30:26,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=365273.3333333333, ans=0.125 2023-09-29 12:30:27,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 12:30:27,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:29,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 12:30:34,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:36,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:30:36,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:30:36,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:30:39,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:30:39,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:43,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:45,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:45,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:45,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:45,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:48,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:30:48,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=365340.0, ans=0.0 2023-09-29 12:30:51,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:51,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 12:30:52,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:54,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 12:30:55,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 12:30:57,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 12:30:57,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:57,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:30:58,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:58,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:58,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 12:31:02,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:31:04,947 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.140e+02 2.424e+02 3.897e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 12:31:05,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:09,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 12:31:11,716 INFO [train.py:1039] (1/4) Epoch 11, batch 1700, loss[loss=0.2067, simple_loss=0.2652, pruned_loss=0.07409, over 23748.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2736, pruned_loss=0.0667, over 4726291.18 frames. ], batch size: 212, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:31:12,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:12,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:31:14,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 12:31:14,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:14,388 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:31:15,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:31:15,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:17,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:31:17,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:31:17,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 12:31:20,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:31:20,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=365473.3333333333, ans=0.125 2023-09-29 12:31:22,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=365473.3333333333, ans=0.0 2023-09-29 12:31:30,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:30,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=365540.0, ans=0.0 2023-09-29 12:31:33,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:31:37,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:31:37,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:31:38,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:38,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:31:42,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 12:31:42,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=365606.6666666667, ans=0.0 2023-09-29 12:31:46,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:31:46,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:48,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:31:49,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:31:51,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 12:31:52,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 12:31:54,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:54,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 12:31:56,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:32:03,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:05,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:05,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=365673.3333333333, ans=0.0 2023-09-29 12:32:06,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:32:08,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:32:08,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 12:32:08,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:32:11,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:11,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 12:32:11,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:32:11,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:13,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:13,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:17,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:17,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:32:17,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:19,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:32:19,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:24,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:24,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 12:32:25,462 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=15.27 vs. limit=15.0 2023-09-29 12:32:26,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:27,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:29,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 12:32:34,317 INFO [train.py:1039] (1/4) Epoch 11, batch 1750, loss[loss=0.191, simple_loss=0.2628, pruned_loss=0.05954, over 20373.00 frames. ], tot_loss[loss=0.2025, simple_loss=0.2725, pruned_loss=0.06622, over 4723535.27 frames. ], batch size: 44, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:32:34,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:37,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:37,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:32:38,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 12:32:39,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:39,861 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.68 vs. limit=15.0 2023-09-29 12:32:42,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:32:42,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:42,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=365806.6666666667, ans=0.0 2023-09-29 12:32:47,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 12:32:50,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:54,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 12:32:54,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:56,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:32:59,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:33:00,040 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.06 vs. limit=15.0 2023-09-29 12:33:00,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 12:33:02,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:33:02,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 12:33:02,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=365873.3333333333, ans=0.125 2023-09-29 12:33:07,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=365940.0, ans=0.015 2023-09-29 12:33:11,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:33:14,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:14,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:19,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:19,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:21,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:33:25,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:27,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:29,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:33:29,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 12:33:32,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:35,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 12:33:35,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:38,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:39,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:33:43,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:33:43,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:33:45,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:46,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:49,675 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.016e+02 2.265e+02 2.513e+02 4.125e+02, threshold=4.530e+02, percent-clipped=0.0 2023-09-29 12:33:51,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:33:55,903 INFO [train.py:1039] (1/4) Epoch 11, batch 1800, loss[loss=0.2146, simple_loss=0.2776, pruned_loss=0.07574, over 23478.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2715, pruned_loss=0.06586, over 4716214.53 frames. ], batch size: 119, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:33:56,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:33:56,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 12:33:56,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:58,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:33:58,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:33:58,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:33:59,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:33:59,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:34:04,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:34:04,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:34:05,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:34:09,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:34:10,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:34:13,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:34:14,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=366206.6666666667, ans=0.1 2023-09-29 12:34:17,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:18,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:34:23,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:34:23,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 12:34:23,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:26,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:31,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 12:34:33,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 12:34:34,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 12:34:34,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:36,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:36,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:34:37,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:34:41,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=366273.3333333333, ans=0.0 2023-09-29 12:34:44,455 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 12:34:45,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:34:47,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:50,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 12:34:50,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 12:34:52,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:34:53,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:34:55,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:35:00,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 12:35:04,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:06,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 12:35:07,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:35:07,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:07,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:35:09,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 12:35:12,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:35:12,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:16,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 12:35:16,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:19,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:19,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:35:19,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,031 INFO [train.py:1039] (1/4) Epoch 11, batch 1850, loss[loss=0.2119, simple_loss=0.2808, pruned_loss=0.07147, over 23263.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2725, pruned_loss=0.06615, over 4722931.75 frames. ], batch size: 119, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:35:21,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:35:24,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:35:24,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:27,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:35:28,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:35:32,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.68 vs. limit=15.0 2023-09-29 12:35:35,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:35:35,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 12:35:40,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 12:35:40,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=366540.0, ans=0.125 2023-09-29 12:35:43,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 12:35:48,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:48,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 12:35:48,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:35:57,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:59,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 12:36:03,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:03,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:05,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 12:36:05,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:05,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:36:07,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:36:10,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:36:12,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:36:17,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:36:17,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:17,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:36:17,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:20,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:22,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:36:26,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 12:36:26,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.34 vs. limit=15.0 2023-09-29 12:36:27,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:32,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:36:32,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:36:32,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 12:36:32,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 12:36:33,752 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 12:36:35,313 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 12:36:35,955 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.18 vs. limit=12.0 2023-09-29 12:36:36,670 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.114e+02 2.440e+02 2.839e+02 4.239e+02, threshold=4.880e+02, percent-clipped=0.0 2023-09-29 12:36:36,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:36:36,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:36,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:36:38,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:38,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 12:36:38,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:36:39,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:39,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:36:41,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:36:42,844 INFO [train.py:1039] (1/4) Epoch 11, batch 1900, loss[loss=0.208, simple_loss=0.2823, pruned_loss=0.06683, over 23978.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2734, pruned_loss=0.06627, over 4718875.39 frames. ], batch size: 80, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:36:43,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:43,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 12:36:46,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:46,192 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 12:36:46,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:36:47,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:55,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:56,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:36:58,084 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 12:37:00,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 12:37:00,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:37:01,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:37:01,849 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 12:37:01,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 12:37:07,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 12:37:09,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:37:13,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 12:37:15,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 12:37:17,858 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.69 vs. limit=12.0 2023-09-29 12:37:23,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 12:37:25,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 12:37:25,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:37:26,801 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 12:37:26,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 12:37:26,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 12:37:28,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 12:37:28,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:37:30,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=367006.6666666667, ans=0.0 2023-09-29 12:37:33,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 12:37:35,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:37:40,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:37:40,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 12:37:43,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:37:46,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=367073.3333333333, ans=0.0 2023-09-29 12:37:47,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 12:37:47,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:37:49,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=367073.3333333333, ans=0.125 2023-09-29 12:37:56,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:37:56,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:37:56,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:37:57,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:37:59,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:37:59,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:38:01,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:38:04,381 INFO [train.py:1039] (1/4) Epoch 11, batch 1950, loss[loss=0.2001, simple_loss=0.28, pruned_loss=0.06011, over 24465.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2735, pruned_loss=0.06612, over 4728740.20 frames. ], batch size: 69, lr: 9.50e-03, grad_scale: 16.0 2023-09-29 12:38:04,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:04,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:07,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:38:07,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:07,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:38:09,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:13,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:16,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:38:16,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:16,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:38:18,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 12:38:20,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:38:20,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:21,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:24,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:38:24,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:25,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:27,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:38:32,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:32,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:38:32,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:38:32,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:35,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:39,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:39,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:39,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:38:39,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 12:38:39,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:38:40,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:38:41,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:44,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:46,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:47,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.03 vs. limit=15.0 2023-09-29 12:38:53,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:38:54,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:38:56,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:38:56,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 12:38:56,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:00,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:39:01,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:39:02,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:04,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=367340.0, ans=0.125 2023-09-29 12:39:09,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:11,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:14,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:16,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=367406.6666666667, ans=0.2 2023-09-29 12:39:17,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:19,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:39:19,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:21,450 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.994e+02 2.316e+02 2.639e+02 3.669e+02, threshold=4.632e+02, percent-clipped=0.0 2023-09-29 12:39:21,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 12:39:21,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:39:21,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:39:23,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 12:39:24,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:27,733 INFO [train.py:1039] (1/4) Epoch 11, batch 2000, loss[loss=0.2, simple_loss=0.2629, pruned_loss=0.06858, over 23573.00 frames. ], tot_loss[loss=0.2039, simple_loss=0.2745, pruned_loss=0.06664, over 4726105.12 frames. ], batch size: 134, lr: 9.50e-03, grad_scale: 32.0 2023-09-29 12:39:28,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=367473.3333333333, ans=0.2 2023-09-29 12:39:29,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:30,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:39:32,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:33,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:39:35,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:37,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 12:39:39,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:39:42,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:39:44,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 12:39:46,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:39:46,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:50,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:39:51,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 12:39:54,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:58,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 12:39:59,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:40:02,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 12:40:02,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:03,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:03,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:40:03,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:05,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:06,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:06,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 12:40:11,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 12:40:11,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:11,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:11,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=367606.6666666667, ans=10.0 2023-09-29 12:40:15,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:18,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:40:18,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:18,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:40:19,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:21,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:22,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:22,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:24,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:28,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:29,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 12:40:33,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=367740.0, ans=0.125 2023-09-29 12:40:35,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:40:35,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:40:41,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:44,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:40:44,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:40:48,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:50,385 INFO [train.py:1039] (1/4) Epoch 11, batch 2050, loss[loss=0.2073, simple_loss=0.2699, pruned_loss=0.0724, over 23618.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2737, pruned_loss=0.066, over 4737129.87 frames. ], batch size: 106, lr: 9.49e-03, grad_scale: 32.0 2023-09-29 12:40:50,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:54,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:55,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:59,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=367806.6666666667, ans=0.015 2023-09-29 12:41:01,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:41:03,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:41:03,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:41:05,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:07,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 12:41:07,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:41:08,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:10,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:41:18,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:18,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:21,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 12:41:24,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:26,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 12:41:26,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:26,722 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:41:30,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:31,335 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:41:32,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:34,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:41:34,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:35,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:41:35,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:41:36,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:41:40,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:41,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:41:43,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:41:44,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:47,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:41:55,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:57,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 12:42:01,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:01,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=368073.3333333333, ans=0.1 2023-09-29 12:42:02,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:42:05,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=368073.3333333333, ans=0.015 2023-09-29 12:42:06,929 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.014e+02 2.317e+02 2.683e+02 4.007e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 12:42:07,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:42:08,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 12:42:09,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=368073.3333333333, ans=0.125 2023-09-29 12:42:12,413 INFO [train.py:1039] (1/4) Epoch 11, batch 2100, loss[loss=0.178, simple_loss=0.2576, pruned_loss=0.04919, over 24523.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2716, pruned_loss=0.06615, over 4720656.54 frames. ], batch size: 66, lr: 9.49e-03, grad_scale: 16.0 2023-09-29 12:42:14,050 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 12:42:14,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:14,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:15,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:15,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:15,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 12:42:17,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 12:42:18,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:42:19,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=368140.0, ans=0.125 2023-09-29 12:42:21,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:42:21,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:42:24,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:25,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:42:25,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 12:42:26,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:42:28,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 12:42:28,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 12:42:28,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=368206.6666666667, ans=0.125 2023-09-29 12:42:32,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:42:32,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:42:32,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 12:42:32,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 12:42:34,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=368206.6666666667, ans=0.125 2023-09-29 12:42:38,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 12:42:38,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:39,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:42:40,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:45,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:42:46,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 12:42:46,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:42:46,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:42:48,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 12:42:48,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:48,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 12:42:50,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 12:42:50,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 12:42:51,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:42:53,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:42:56,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:58,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:59,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:01,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:01,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 12:43:01,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:01,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:03,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:03,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 12:43:04,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 12:43:06,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 12:43:06,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368340.0, ans=0.1 2023-09-29 12:43:09,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:43:12,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:43:14,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 12:43:19,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:21,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:43:22,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:43:22,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:43:22,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:43:23,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:43:25,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.37 vs. limit=22.5 2023-09-29 12:43:25,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:25,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:43:26,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:43:27,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:28,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 12:43:29,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=368406.6666666667, ans=0.0 2023-09-29 12:43:30,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 12:43:30,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:32,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:32,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:43:34,240 INFO [train.py:1039] (1/4) Epoch 11, batch 2150, loss[loss=0.1847, simple_loss=0.2611, pruned_loss=0.05412, over 24474.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.271, pruned_loss=0.06566, over 4712241.92 frames. ], batch size: 66, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:43:34,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:43:34,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:43:41,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:43:42,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:44,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:44,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:43:44,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:45,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:43:49,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368540.0, ans=0.1 2023-09-29 12:43:51,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:51,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:43:51,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:43:55,339 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.65 vs. limit=15.0 2023-09-29 12:43:56,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:56,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 12:44:02,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:04,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:44:05,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:05,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:44:07,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:07,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:44:07,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:44:09,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 12:44:12,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:44:12,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:14,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:14,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:44:16,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:44:17,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:17,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:44:19,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:19,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 12:44:19,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:44:22,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:22,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=368673.3333333333, ans=0.0 2023-09-29 12:44:24,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:25,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:25,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=368673.3333333333, ans=0.125 2023-09-29 12:44:27,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:44:29,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:29,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:29,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 12:44:31,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=368673.3333333333, ans=0.125 2023-09-29 12:44:32,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 12:44:32,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:44:32,982 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 12:44:33,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:33,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:44:34,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 12:44:34,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:44:34,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 12:44:34,727 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 12:44:34,728 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 12:44:34,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 12:44:36,511 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:44:37,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:39,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:39,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:44:39,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=368740.0, ans=0.0 2023-09-29 12:44:40,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:40,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=368740.0, ans=0.0 2023-09-29 12:44:40,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=368740.0, ans=0.125 2023-09-29 12:44:42,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:44:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:45,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:52,248 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.938e+02 2.164e+02 2.545e+02 3.667e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-29 12:44:54,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:44:55,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 12:44:56,978 INFO [train.py:1039] (1/4) Epoch 11, batch 2200, loss[loss=0.2158, simple_loss=0.2745, pruned_loss=0.0785, over 23406.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2714, pruned_loss=0.06517, over 4724258.90 frames. ], batch size: 285, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:44:57,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:44:57,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=368806.6666666667, ans=0.0 2023-09-29 12:45:01,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:01,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:45:02,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:05,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:45:05,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368806.6666666667, ans=0.1 2023-09-29 12:45:07,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:45:08,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:45:08,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 12:45:12,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.54 vs. limit=15.0 2023-09-29 12:45:13,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 12:45:15,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:45:17,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=368873.3333333333, ans=0.2 2023-09-29 12:45:20,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 12:45:24,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:24,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:26,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:45:29,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:45:29,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 12:45:33,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:45:36,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:36,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:45:39,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.94 vs. limit=22.5 2023-09-29 12:45:40,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:45:41,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:43,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:45:45,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:49,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 12:45:50,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:45:51,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 12:45:55,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:55,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:45:55,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:57,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:58,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:58,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:00,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:01,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:46:01,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:46:04,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:46:07,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:46:07,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:11,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:46:12,638 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 12:46:14,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:46:15,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 12:46:17,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:46:17,084 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 12:46:19,826 INFO [train.py:1039] (1/4) Epoch 11, batch 2250, loss[loss=0.2279, simple_loss=0.2825, pruned_loss=0.08668, over 22705.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2716, pruned_loss=0.06494, over 4733354.91 frames. ], batch size: 322, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:46:19,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:20,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:46:21,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:25,098 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 12:46:25,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:46:26,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:30,793 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:46:33,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:46:33,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:46:35,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=369206.6666666667, ans=0.2 2023-09-29 12:46:36,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:38,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:38,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:41,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 12:46:42,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:46:42,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:46:44,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 12:46:46,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:46:46,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:48,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:52,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:53,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:46:53,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:46:55,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 12:46:56,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.52 vs. limit=15.0 2023-09-29 12:46:57,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:58,015 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.50 vs. limit=12.0 2023-09-29 12:46:59,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:47:05,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:06,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:07,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:47:07,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:47:07,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=369273.3333333333, ans=0.0 2023-09-29 12:47:09,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=369340.0, ans=0.0 2023-09-29 12:47:10,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:47:12,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:47:16,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:47:20,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:47:25,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:47:25,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:47:25,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:47:26,422 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.57 vs. limit=15.0 2023-09-29 12:47:32,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:47:35,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:47:35,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 12:47:35,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:36,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:47:38,355 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 2.043e+02 2.273e+02 2.723e+02 4.405e+02, threshold=4.547e+02, percent-clipped=1.0 2023-09-29 12:47:38,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 12:47:42,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:47:43,471 INFO [train.py:1039] (1/4) Epoch 11, batch 2300, loss[loss=0.2088, simple_loss=0.2882, pruned_loss=0.06477, over 24559.00 frames. ], tot_loss[loss=0.2028, simple_loss=0.2732, pruned_loss=0.06623, over 4730097.58 frames. ], batch size: 71, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:47:43,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:46,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=369473.3333333333, ans=0.04949747468305833 2023-09-29 12:47:48,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:47:51,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 12:47:52,172 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.15 vs. limit=10.0 2023-09-29 12:47:52,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:00,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:48:00,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:48:01,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:03,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:03,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 12:48:03,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:48:05,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:05,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:48:12,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:48:13,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:48:16,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:18,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=369606.6666666667, ans=0.125 2023-09-29 12:48:21,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:48:21,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:22,321 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.72 vs. limit=15.0 2023-09-29 12:48:24,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:48:26,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:48:29,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:29,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:48:29,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:48:31,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 12:48:35,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=369673.3333333333, ans=0.025 2023-09-29 12:48:38,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:48:38,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:38,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:38,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:48:38,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:40,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 12:48:40,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:48:41,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 12:48:41,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:48:41,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:41,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 12:48:49,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:48:52,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:48:55,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:56,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:48:57,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:49:00,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:49:00,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:02,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:49:04,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 12:49:05,506 INFO [train.py:1039] (1/4) Epoch 11, batch 2350, loss[loss=0.1848, simple_loss=0.2548, pruned_loss=0.05741, over 24430.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2722, pruned_loss=0.06593, over 4732833.44 frames. ], batch size: 58, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:49:05,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=369806.6666666667, ans=0.0 2023-09-29 12:49:08,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.97 vs. limit=10.0 2023-09-29 12:49:11,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:49:12,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 12:49:17,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 12:49:22,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:49:25,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:27,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:28,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 12:49:32,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:49:36,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 12:49:38,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:41,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.19 vs. limit=15.0 2023-09-29 12:49:42,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:49:42,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:45,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:49:47,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 12:49:48,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:49:50,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:50,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:49:50,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:49:52,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=369940.0, ans=0.0 2023-09-29 12:49:55,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:49:57,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 12:49:57,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:50:00,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:50:01,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:50:04,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 12:50:05,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:50:05,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=370006.6666666667, ans=0.125 2023-09-29 12:50:07,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 12:50:08,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:50:12,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 12:50:15,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 12:50:17,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:50:17,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:50:18,003 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 12:50:18,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 12:50:19,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 12:50:22,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:50:23,822 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.084e+02 2.454e+02 3.278e+02 4.890e+02, threshold=4.908e+02, percent-clipped=1.0 2023-09-29 12:50:25,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:50:29,137 INFO [train.py:1039] (1/4) Epoch 11, batch 2400, loss[loss=0.2233, simple_loss=0.2942, pruned_loss=0.07624, over 23234.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2718, pruned_loss=0.0662, over 4728532.61 frames. ], batch size: 105, lr: 9.46e-03, grad_scale: 32.0 2023-09-29 12:50:30,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:50:32,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:50:33,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 12:50:34,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 12:50:40,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:50:40,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:50:41,315 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.98 vs. limit=15.0 2023-09-29 12:50:43,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 12:50:43,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:50:45,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:47,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 12:50:51,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=370206.6666666667, ans=0.07 2023-09-29 12:50:54,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:56,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 12:51:02,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:51:06,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=370273.3333333333, ans=0.2 2023-09-29 12:51:07,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 12:51:12,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:13,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:17,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:18,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 12:51:20,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:51:27,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:30,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:51:32,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:51:33,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:51:33,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:51:33,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:51:33,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:35,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:51:35,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:51:39,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=370406.6666666667, ans=0.125 2023-09-29 12:51:40,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:51:40,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:51:40,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 12:51:43,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 12:51:44,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=370406.6666666667, ans=0.0 2023-09-29 12:51:46,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:46,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:46,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 12:51:48,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 12:51:48,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 12:51:48,287 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 12:51:49,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 12:51:51,315 INFO [train.py:1039] (1/4) Epoch 11, batch 2450, loss[loss=0.1794, simple_loss=0.2538, pruned_loss=0.05248, over 21205.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2708, pruned_loss=0.0658, over 4728245.75 frames. ], batch size: 46, lr: 9.46e-03, grad_scale: 16.0 2023-09-29 12:51:51,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:51,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:51,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:51:54,651 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 12:51:54,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:54,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:51:59,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:51:59,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:52:03,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:03,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:04,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 12:52:10,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:52:10,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:14,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:52:14,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:52:14,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:52:16,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 12:52:19,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:21,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:52:22,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:52:23,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=370606.6666666667, ans=0.2 2023-09-29 12:52:25,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:52:25,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:52:30,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 12:52:30,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:52:38,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:40,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:40,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:41,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:52:41,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:45,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:52:46,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.34 vs. limit=15.0 2023-09-29 12:52:46,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 12:52:48,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:50,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:52:53,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:52:53,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:53,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=370673.3333333333, ans=0.0 2023-09-29 12:52:53,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=370673.3333333333, ans=0.0 2023-09-29 12:52:59,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:52:59,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 12:53:01,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:53:01,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:01,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 12:53:02,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:02,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:53:05,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=370740.0, ans=0.025 2023-09-29 12:53:07,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:53:11,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:11,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:53:12,850 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.006e+02 2.269e+02 2.537e+02 3.932e+02, threshold=4.538e+02, percent-clipped=0.0 2023-09-29 12:53:14,457 INFO [train.py:1039] (1/4) Epoch 11, batch 2500, loss[loss=0.1924, simple_loss=0.263, pruned_loss=0.0609, over 23612.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2702, pruned_loss=0.06556, over 4716120.02 frames. ], batch size: 149, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:53:16,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 12:53:16,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:53:22,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:23,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=370806.6666666667, ans=0.09899494936611666 2023-09-29 12:53:32,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:53:32,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:53:35,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.93 vs. limit=15.0 2023-09-29 12:53:35,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:35,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 12:53:44,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=370873.3333333333, ans=0.125 2023-09-29 12:53:45,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:53:45,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:47,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:53:47,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:53:48,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 12:53:50,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:50,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:53:50,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 12:53:50,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:52,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 12:53:52,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:56,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:56,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:54:00,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:54:00,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 12:54:00,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:04,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:04,993 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.25 vs. limit=15.0 2023-09-29 12:54:07,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:11,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:12,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=371006.6666666667, ans=0.125 2023-09-29 12:54:15,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:20,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:54:21,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=371073.3333333333, ans=0.1 2023-09-29 12:54:23,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 12:54:23,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:54:25,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:54:26,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:54:26,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:54:28,429 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 12:54:28,430 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 12:54:28,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 12:54:31,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:33,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 12:54:33,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 12:54:35,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:36,964 INFO [train.py:1039] (1/4) Epoch 11, batch 2550, loss[loss=0.2205, simple_loss=0.2794, pruned_loss=0.08077, over 23457.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2707, pruned_loss=0.06579, over 4704939.68 frames. ], batch size: 285, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:54:37,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 12:54:40,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 12:54:42,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:42,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=371140.0, ans=0.1 2023-09-29 12:54:45,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:45,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:54:48,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:48,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 12:54:50,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:54:54,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 12:54:55,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:54:57,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:00,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:55:00,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 12:55:00,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:00,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:02,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:05,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:55:05,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 12:55:05,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:55:06,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:06,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 12:55:07,451 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-09-29 12:55:20,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:55:24,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:24,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:24,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:26,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:55:34,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:37,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:37,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:55:37,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:55:37,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:55:37,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:55:38,313 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=22.5 2023-09-29 12:55:40,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:40,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:48,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:55:48,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 12:55:48,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:55:49,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:49,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:55:51,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:55:51,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:55:57,878 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.868e+02 2.141e+02 2.517e+02 4.100e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 12:55:59,463 INFO [train.py:1039] (1/4) Epoch 11, batch 2600, loss[loss=0.2099, simple_loss=0.2735, pruned_loss=0.07316, over 23844.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2717, pruned_loss=0.06603, over 4718622.87 frames. ], batch size: 195, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:55:59,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:01,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:04,876 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 12:56:06,537 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 12:56:06,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:56:08,064 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 12:56:08,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 12:56:08,221 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 12:56:11,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:56:11,281 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 12:56:12,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.97 vs. limit=10.0 2023-09-29 12:56:12,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 12:56:16,360 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 12:56:18,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:56:20,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 12:56:21,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 12:56:23,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:56:23,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 12:56:25,998 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 12:56:26,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 12:56:26,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=371540.0, ans=0.2 2023-09-29 12:56:34,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:34,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:34,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:34,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 12:56:36,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:56:42,651 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 12:56:48,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:48,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:50,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 12:56:52,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:56:52,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:52,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 12:56:55,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:56:57,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:58,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:03,915 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 12:57:03,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:03,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:57:08,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:57:10,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:57:10,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 12:57:11,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:57:13,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:13,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:13,512 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:57:20,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 12:57:20,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:21,943 INFO [train.py:1039] (1/4) Epoch 11, batch 2650, loss[loss=0.2788, simple_loss=0.3233, pruned_loss=0.1171, over 19396.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2729, pruned_loss=0.06689, over 4714868.61 frames. ], batch size: 388, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:57:23,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:57:28,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 12:57:28,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:30,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:57:30,342 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 12:57:30,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:57:32,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:34,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:57:35,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:38,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 12:57:40,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:57:40,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:57:43,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 12:57:44,754 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 12:57:48,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:49,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 12:57:49,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:57:49,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 12:57:55,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:55,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:57:55,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:56,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:57:58,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=371940.0, ans=0.125 2023-09-29 12:57:58,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=371940.0, ans=0.125 2023-09-29 12:58:00,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 12:58:00,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 12:58:03,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:07,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 12:58:07,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:08,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:10,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:10,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:10,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:13,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:13,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:15,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:58:15,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:58:16,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:58:20,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:20,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:58:21,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:23,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:23,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:58:27,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:27,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:58:27,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:29,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 12:58:36,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:38,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:38,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:39,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:39,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:41,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:43,261 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.976e+02 2.292e+02 2.609e+02 3.713e+02, threshold=4.584e+02, percent-clipped=0.0 2023-09-29 12:58:44,819 INFO [train.py:1039] (1/4) Epoch 11, batch 2700, loss[loss=0.1634, simple_loss=0.2371, pruned_loss=0.04487, over 24304.00 frames. ], tot_loss[loss=0.2037, simple_loss=0.2734, pruned_loss=0.067, over 4723831.52 frames. ], batch size: 56, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:58:44,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:58:44,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 12:58:46,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:58:48,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 12:58:49,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:49,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:51,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:58:51,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:51,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:58:51,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:58:53,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 12:58:54,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:58:56,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:57,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:58:59,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:59,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=372206.6666666667, ans=0.125 2023-09-29 12:59:02,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:59:04,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 12:59:04,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:07,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:59:08,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:14,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:59:14,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:59:14,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:59:14,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:59:19,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:21,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:21,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:59:21,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:59:22,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=372273.3333333333, ans=0.125 2023-09-29 12:59:25,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:27,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:59:32,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=372340.0, ans=0.2 2023-09-29 12:59:37,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:59:38,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:59:41,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:59:41,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:59:45,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:45,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:47,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:50,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:52,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:52,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:59:55,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:55,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:56,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:59,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 13:00:01,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:03,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:00:03,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 13:00:05,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 13:00:06,667 INFO [train.py:1039] (1/4) Epoch 11, batch 2750, loss[loss=0.2333, simple_loss=0.2924, pruned_loss=0.08711, over 23163.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.2721, pruned_loss=0.06662, over 4727442.74 frames. ], batch size: 105, lr: 9.43e-03, grad_scale: 8.0 2023-09-29 13:00:06,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:10,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:10,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:13,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:13,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:00:13,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:17,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:00:17,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:00:17,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 13:00:19,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:00:19,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:24,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 13:00:27,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:00:27,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:27,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:00:29,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:00:29,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:31,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:00:32,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:32,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:37,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:00:37,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:00:37,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:00:39,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:41,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:00:41,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=372606.6666666667, ans=0.125 2023-09-29 13:00:47,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:49,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:00:50,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:51,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=372606.6666666667, ans=0.1 2023-09-29 13:00:57,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:57,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:00:57,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:00:58,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.71 vs. limit=15.0 2023-09-29 13:00:58,298 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.52 vs. limit=22.5 2023-09-29 13:01:02,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:01:03,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:01:03,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 13:01:08,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:09,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 13:01:15,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:01:18,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:01:19,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 13:01:20,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:01:23,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:01:23,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 13:01:24,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:01:25,971 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.991e+02 2.210e+02 2.554e+02 4.000e+02, threshold=4.420e+02, percent-clipped=0.0 2023-09-29 13:01:26,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=372806.6666666667, ans=0.125 2023-09-29 13:01:27,632 INFO [train.py:1039] (1/4) Epoch 11, batch 2800, loss[loss=0.2043, simple_loss=0.2621, pruned_loss=0.0732, over 23870.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2705, pruned_loss=0.06582, over 4732304.03 frames. ], batch size: 195, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:01:27,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:01:27,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:01:30,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 13:01:30,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:30,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:32,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:32,780 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.92 vs. limit=15.0 2023-09-29 13:01:33,612 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 13:01:33,613 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 13:01:36,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:38,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:01:39,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:01:42,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:01:44,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 13:01:47,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:01:49,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 13:01:51,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:52,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:01:52,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:01:53,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=372873.3333333333, ans=0.0 2023-09-29 13:01:56,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:01:56,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:56,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:01:57,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:05,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:02:07,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:10,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:12,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:02:13,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:18,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:18,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 13:02:19,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:20,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:20,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:02:25,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:25,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:26,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=373006.6666666667, ans=0.0 2023-09-29 13:02:27,965 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.31 vs. limit=10.0 2023-09-29 13:02:30,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:32,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:02:33,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:33,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:02:33,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:02:34,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:02:35,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:02:35,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 13:02:36,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:38,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 13:02:39,764 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.06 vs. limit=15.0 2023-09-29 13:02:40,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:02:40,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:02:41,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 13:02:49,513 INFO [train.py:1039] (1/4) Epoch 11, batch 2850, loss[loss=0.1922, simple_loss=0.2845, pruned_loss=0.0499, over 24640.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2696, pruned_loss=0.0651, over 4733224.42 frames. ], batch size: 73, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:02:49,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:49,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:02:50,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.49 vs. limit=15.0 2023-09-29 13:02:51,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:02:53,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:02:53,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.64 vs. limit=15.0 2023-09-29 13:02:54,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=373140.0, ans=15.0 2023-09-29 13:02:58,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:02:58,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:58,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:03:01,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:01,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:03:03,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:03:03,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 13:03:03,979 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.66 vs. limit=15.0 2023-09-29 13:03:06,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=373206.6666666667, ans=0.0 2023-09-29 13:03:10,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 13:03:10,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:10,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=373206.6666666667, ans=0.125 2023-09-29 13:03:11,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 13:03:13,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:14,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 13:03:14,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 13:03:15,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=373206.6666666667, ans=0.1 2023-09-29 13:03:16,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:27,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=373273.3333333333, ans=0.125 2023-09-29 13:03:30,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:32,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:32,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:03:32,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.83 vs. limit=10.0 2023-09-29 13:03:33,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:03:33,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:03:35,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:03:36,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:03:41,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 13:03:43,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:03:44,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:03:44,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:46,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:48,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:48,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:49,186 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.14 vs. limit=15.0 2023-09-29 13:03:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:53,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:53,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=373340.0, ans=0.125 2023-09-29 13:03:55,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:03:55,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:56,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:58,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:04:04,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:04:04,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=373406.6666666667, ans=0.0 2023-09-29 13:04:06,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 13:04:06,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 13:04:10,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:04:10,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:10,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 13:04:10,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:04:10,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=373406.6666666667, ans=0.0 2023-09-29 13:04:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:13,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:13,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:04:13,410 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 13:04:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 13:04:14,336 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.38 vs. limit=12.0 2023-09-29 13:04:14,713 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.974e+02 2.166e+02 2.699e+02 4.540e+02, threshold=4.331e+02, percent-clipped=1.0 2023-09-29 13:04:14,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:14,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:16,394 INFO [train.py:1039] (1/4) Epoch 11, batch 2900, loss[loss=0.2035, simple_loss=0.2709, pruned_loss=0.06803, over 19542.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.27, pruned_loss=0.06482, over 4733907.36 frames. ], batch size: 42, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:04:18,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:18,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:18,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:04:18,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=373473.3333333333, ans=0.125 2023-09-29 13:04:19,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 13:04:25,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:26,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 13:04:26,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 13:04:28,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:04:28,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:04:30,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:31,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:04:36,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:36,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:40,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:04:40,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 13:04:40,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=373540.0, ans=0.2 2023-09-29 13:04:41,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:04:43,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:45,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 13:04:45,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 13:04:45,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=373540.0, ans=0.125 2023-09-29 13:04:48,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:48,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 13:04:48,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:04:49,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:04:49,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:50,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=373606.6666666667, ans=0.1 2023-09-29 13:04:50,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=373606.6666666667, ans=0.125 2023-09-29 13:04:52,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:53,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:58,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:05:01,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:04,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 13:05:04,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 13:05:04,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:05:05,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=373673.3333333333, ans=0.0 2023-09-29 13:05:05,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.75 vs. limit=15.0 2023-09-29 13:05:07,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:05:09,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=373673.3333333333, ans=0.1 2023-09-29 13:05:10,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 13:05:13,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:05:16,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=373673.3333333333, ans=0.035 2023-09-29 13:05:19,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:05:27,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:05:28,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:05:31,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 13:05:34,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:34,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 13:05:34,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:34,771 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-09-29 13:05:35,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:05:39,103 INFO [train.py:1039] (1/4) Epoch 11, batch 2950, loss[loss=0.2085, simple_loss=0.2851, pruned_loss=0.06597, over 24565.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2709, pruned_loss=0.06517, over 4735439.62 frames. ], batch size: 71, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:05:43,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:45,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 13:05:45,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:05:45,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:46,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:05:48,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:05:48,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 13:05:50,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 13:05:52,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:05:52,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:06:00,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:00,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=373873.3333333333, ans=0.0 2023-09-29 13:06:01,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:04,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:06:04,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:08,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:08,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:06:11,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:06:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 13:06:20,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 13:06:20,920 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 13:06:22,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:06:24,035 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 13:06:24,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 13:06:24,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=373940.0, ans=0.125 2023-09-29 13:06:25,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:27,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:27,098 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 13:06:27,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:06:29,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 13:06:31,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:32,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:06:34,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:34,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:06:36,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 13:06:36,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:36,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=374006.6666666667, ans=0.125 2023-09-29 13:06:37,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 13:06:45,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:45,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:06:47,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 13:06:47,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:06:48,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 13:06:50,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:06:52,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:53,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:06:54,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=374073.3333333333, ans=0.125 2023-09-29 13:06:55,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:55,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:06:55,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=374073.3333333333, ans=10.0 2023-09-29 13:06:58,225 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.184e+02 2.499e+02 3.162e+02 5.312e+02, threshold=4.998e+02, percent-clipped=4.0 2023-09-29 13:06:58,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:06:59,750 INFO [train.py:1039] (1/4) Epoch 11, batch 3000, loss[loss=0.209, simple_loss=0.2916, pruned_loss=0.06319, over 24459.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2724, pruned_loss=0.06616, over 4734942.23 frames. ], batch size: 69, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:06:59,751 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 13:07:13,676 INFO [train.py:1071] (1/4) Epoch 11, validation: loss=0.3146, simple_loss=0.2865, pruned_loss=0.1713, over 1125622.00 frames. 2023-09-29 13:07:13,677 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 13:07:13,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:13,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:07:13,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:07:15,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:07:15,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=374140.0, ans=0.0 2023-09-29 13:07:16,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:07:18,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:18,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 13:07:20,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:23,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:07:23,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:07:28,280 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 13:07:28,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 13:07:29,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:07:31,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:07:31,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 13:07:31,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:37,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:07:45,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374273.3333333333, ans=0.1 2023-09-29 13:07:46,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:07:52,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 13:07:52,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:07:54,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:07:55,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:55,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:07:57,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:07:57,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 13:07:59,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=374273.3333333333, ans=0.125 2023-09-29 13:08:00,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 13:08:01,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:08:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:08:04,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:08:04,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:06,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=374340.0, ans=0.1 2023-09-29 13:08:07,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:07,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:08:10,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:08:10,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:08:10,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:08:13,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:17,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 13:08:17,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=374340.0, ans=0.125 2023-09-29 13:08:18,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:08:18,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:18,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:08:21,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=374406.6666666667, ans=0.125 2023-09-29 13:08:23,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:23,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:26,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:08:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 13:08:26,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=374406.6666666667, ans=0.0 2023-09-29 13:08:28,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:08:28,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 13:08:28,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:08:30,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 13:08:30,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.86 vs. limit=6.0 2023-09-29 13:08:34,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:08:35,969 INFO [train.py:1039] (1/4) Epoch 11, batch 3050, loss[loss=0.208, simple_loss=0.2678, pruned_loss=0.07412, over 23619.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.273, pruned_loss=0.06653, over 4733337.94 frames. ], batch size: 256, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:08:36,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:08:36,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 13:08:37,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 13:08:37,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:08:39,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:08:39,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:39,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:08:39,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:08:43,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=374473.3333333333, ans=0.2 2023-09-29 13:08:44,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 13:08:45,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:08:47,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:08:48,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:08:52,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:57,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 13:08:59,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=374540.0, ans=0.2 2023-09-29 13:08:59,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=374540.0, ans=0.0 2023-09-29 13:08:59,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=374540.0, ans=0.125 2023-09-29 13:09:02,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 13:09:02,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 13:09:02,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:06,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:09:09,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:09,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:10,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:12,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:12,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:09:13,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:13,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:13,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:15,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:18,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:22,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:22,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 13:09:24,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:24,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:09:27,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:09:29,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:09:29,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:09:29,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:34,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:35,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.77 vs. limit=10.0 2023-09-29 13:09:42,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:42,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:09:42,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:42,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=374740.0, ans=0.125 2023-09-29 13:09:45,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:45,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:09:47,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:47,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 13:09:49,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:49,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:50,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 13:09:51,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=374740.0, ans=0.2 2023-09-29 13:09:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:57,150 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.09 vs. limit=15.0 2023-09-29 13:09:57,386 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.035e+02 2.257e+02 2.557e+02 3.814e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-29 13:09:58,902 INFO [train.py:1039] (1/4) Epoch 11, batch 3100, loss[loss=0.2189, simple_loss=0.2742, pruned_loss=0.08182, over 23732.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2731, pruned_loss=0.06667, over 4727164.40 frames. ], batch size: 164, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:09:59,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:00,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:10:04,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:10:04,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 13:10:04,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=374806.6666666667, ans=0.125 2023-09-29 13:10:07,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 13:10:07,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 13:10:10,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:10:13,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:10:14,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:15,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 13:10:17,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=374873.3333333333, ans=0.125 2023-09-29 13:10:20,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:26,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 13:10:30,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:10:31,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:31,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:10:31,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:10:33,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:10:37,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:10:37,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 13:10:37,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:10:38,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:39,367 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.00 vs. limit=15.0 2023-09-29 13:10:41,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 13:10:43,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:10:46,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:10:48,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 13:10:48,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 13:10:49,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:49,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=375006.6666666667, ans=0.0 2023-09-29 13:10:51,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:54,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:10:54,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:54,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:10:55,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:10:55,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:57,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:10:57,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:10:57,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:57,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:11:01,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:11:03,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 13:11:03,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=375073.3333333333, ans=0.125 2023-09-29 13:11:06,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:11:06,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 13:11:07,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:07,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:07,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 13:11:19,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 13:11:21,188 INFO [train.py:1039] (1/4) Epoch 11, batch 3150, loss[loss=0.1925, simple_loss=0.251, pruned_loss=0.06705, over 23625.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2715, pruned_loss=0.06581, over 4718537.87 frames. ], batch size: 256, lr: 9.40e-03, grad_scale: 16.0 2023-09-29 13:11:22,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:22,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:25,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:11:25,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:11:27,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 13:11:28,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:28,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:11:30,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 13:11:31,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:33,362 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 13:11:37,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 13:11:37,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:11:40,264 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 13:11:40,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:11:40,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=375206.6666666667, ans=0.5 2023-09-29 13:11:41,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 13:11:43,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 13:11:43,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 13:11:43,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:43,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:11:44,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:48,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 13:11:49,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:50,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:52,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:11:53,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:11:56,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 13:11:56,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:11:58,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:12:00,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:12:00,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 13:12:01,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 13:12:02,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=375273.3333333333, ans=0.2 2023-09-29 13:12:03,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:12:03,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:12:03,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:12:04,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:04,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:12:07,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:12:08,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:12:08,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 13:12:10,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:12:10,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:13,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:12:13,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:12:13,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 13:12:15,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:16,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 13:12:16,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:18,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 13:12:19,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 13:12:21,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:12:21,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:23,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 13:12:24,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 13:12:26,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:29,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:12:31,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:31,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:12:33,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=375406.6666666667, ans=22.5 2023-09-29 13:12:37,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:12:37,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:39,895 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.896e+02 2.240e+02 2.702e+02 3.896e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 13:12:40,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 13:12:41,471 INFO [train.py:1039] (1/4) Epoch 11, batch 3200, loss[loss=0.2189, simple_loss=0.2892, pruned_loss=0.07427, over 23967.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.2707, pruned_loss=0.06496, over 4734274.85 frames. ], batch size: 80, lr: 9.40e-03, grad_scale: 32.0 2023-09-29 13:12:45,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:12:45,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 13:12:50,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:50,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:12:50,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 13:12:53,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:57,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.68 vs. limit=15.0 2023-09-29 13:13:00,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:13:00,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=375540.0, ans=0.05 2023-09-29 13:13:03,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:13:11,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:13:17,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.whiten.whitening_limit, batch_count=375606.6666666667, ans=15.0 2023-09-29 13:13:20,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=375606.6666666667, ans=0.0 2023-09-29 13:13:21,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 13:13:23,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:13:25,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 13:13:25,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=375606.6666666667, ans=0.125 2023-09-29 13:13:25,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=375606.6666666667, ans=0.1 2023-09-29 13:13:26,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:13:29,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:13:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:13:30,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:13:34,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 13:13:35,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:13:37,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=375673.3333333333, ans=0.0 2023-09-29 13:13:38,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 13:13:41,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 13:13:43,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:13:49,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:49,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:13:50,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:50,457 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 13:13:50,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:13:50,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=375740.0, ans=0.1 2023-09-29 13:13:52,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=375740.0, ans=0.1 2023-09-29 13:13:57,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:13:57,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 13:13:58,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 13:14:00,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 13:14:02,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 13:14:04,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:14:06,034 INFO [train.py:1039] (1/4) Epoch 11, batch 3250, loss[loss=0.2106, simple_loss=0.2757, pruned_loss=0.07279, over 23774.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2701, pruned_loss=0.06475, over 4715724.82 frames. ], batch size: 232, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:14:06,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:14:07,601 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 13:14:07,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:07,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:10,582 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 13:14:15,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:14:18,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:26,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:14:26,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 13:14:26,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:26,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:14:26,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:28,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:28,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:14:31,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:32,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:14:32,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:34,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:34,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:34,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:14:37,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:37,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:39,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:40,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:42,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:42,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:42,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:14:45,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=375940.0, ans=0.125 2023-09-29 13:14:47,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 13:14:48,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:48,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:14:50,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:50,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:14:57,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:15:05,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:07,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 13:15:07,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:15:07,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:15:07,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:10,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 13:15:10,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 13:15:12,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:15:12,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:13,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:13,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 13:15:14,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.96 vs. limit=15.0 2023-09-29 13:15:15,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:15,595 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=376073.3333333333, ans=0.125 2023-09-29 13:15:18,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:18,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:21,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 13:15:21,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:24,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:15:24,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 13:15:26,175 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.864e+02 2.159e+02 2.577e+02 4.318e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-29 13:15:27,694 INFO [train.py:1039] (1/4) Epoch 11, batch 3300, loss[loss=0.1887, simple_loss=0.2774, pruned_loss=0.04998, over 24401.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2717, pruned_loss=0.06567, over 4699987.37 frames. ], batch size: 69, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:15:27,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:27,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 13:15:29,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 13:15:29,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 13:15:29,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:32,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=376140.0, ans=0.025 2023-09-29 13:15:32,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=376140.0, ans=0.0 2023-09-29 13:15:33,997 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.72 vs. limit=12.0 2023-09-29 13:15:34,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:36,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:15:36,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:38,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:15:38,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:15:42,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:42,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=376140.0, ans=0.125 2023-09-29 13:15:43,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:48,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 13:15:48,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:15:48,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:50,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:51,721 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 13:15:53,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:15:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:15:55,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:15:55,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:15:55,401 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 13:15:58,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:58,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:15:58,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=376206.6666666667, ans=0.1 2023-09-29 13:16:01,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:01,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 13:16:03,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 13:16:03,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:05,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:16:08,214 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 13:16:09,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 13:16:09,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:16:12,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 13:16:15,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:19,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:16:19,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:22,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:22,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:22,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:16:22,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:16:24,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:16:24,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:24,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=376340.0, ans=0.0 2023-09-29 13:16:25,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:16:27,897 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 13:16:29,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 13:16:30,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:16:30,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:16:30,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:33,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:33,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:36,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:16:36,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:36,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:16:38,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:40,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:16:43,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 13:16:43,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:46,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:46,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:16:48,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:50,014 INFO [train.py:1039] (1/4) Epoch 11, batch 3350, loss[loss=0.2014, simple_loss=0.2795, pruned_loss=0.06164, over 24365.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2725, pruned_loss=0.06591, over 4715881.50 frames. ], batch size: 77, lr: 9.38e-03, grad_scale: 32.0 2023-09-29 13:16:50,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:52,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:52,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:54,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:55,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:58,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:17:02,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:05,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:17:05,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:06,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:17:08,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 13:17:08,322 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 13:17:08,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:11,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 13:17:12,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=376540.0, ans=15.0 2023-09-29 13:17:13,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 13:17:14,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:17:14,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:17:16,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:16,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 13:17:16,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:17,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:17:19,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:22,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:22,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:24,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:17:28,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:31,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:32,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:37,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:17:37,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:40,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:40,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:41,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:44,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 13:17:44,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:17:44,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 13:17:45,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:17:47,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 13:17:48,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:50,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:52,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=376673.3333333333, ans=0.0 2023-09-29 13:17:56,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 13:17:59,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:01,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:18:01,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:18:05,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:08,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 13:18:08,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:18:09,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:18:10,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:10,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 13:18:11,974 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.933e+02 2.082e+02 2.375e+02 4.063e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 13:18:12,038 INFO [train.py:1039] (1/4) Epoch 11, batch 3400, loss[loss=0.186, simple_loss=0.2612, pruned_loss=0.05539, over 24473.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2737, pruned_loss=0.06646, over 4716957.30 frames. ], batch size: 63, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:18:12,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:18:12,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 13:18:13,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:18:17,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:18:17,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 13:18:22,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 13:18:22,677 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 13:18:22,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:18:22,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=376806.6666666667, ans=0.1 2023-09-29 13:18:27,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:27,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:27,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:28,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:18:35,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:18:37,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 13:18:40,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:18:42,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:42,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:43,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:18:50,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:18:55,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 13:19:03,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:04,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:04,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 13:19:04,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:05,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=377006.6666666667, ans=0.1 2023-09-29 13:19:07,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:08,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:19:08,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:19:12,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:19:15,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:19:15,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:19:21,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:24,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 13:19:28,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=377073.3333333333, ans=0.125 2023-09-29 13:19:30,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:19:35,256 INFO [train.py:1039] (1/4) Epoch 11, batch 3450, loss[loss=0.1998, simple_loss=0.2799, pruned_loss=0.05985, over 24482.00 frames. ], tot_loss[loss=0.2028, simple_loss=0.2737, pruned_loss=0.066, over 4721697.33 frames. ], batch size: 69, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:19:35,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 13:19:38,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 13:19:40,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:41,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.52 vs. limit=15.0 2023-09-29 13:19:41,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:19:41,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 13:19:43,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:49,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:19:53,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:19:54,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:19:55,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:19:55,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:05,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 13:20:10,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 13:20:10,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:20:12,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:20:12,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=377273.3333333333, ans=0.1 2023-09-29 13:20:13,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:20,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 13:20:21,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:20:26,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:27,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:20:29,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:20:29,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:20:31,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 13:20:31,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:32,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:35,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:20:37,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 13:20:41,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:20:43,965 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.25 vs. limit=15.0 2023-09-29 13:20:44,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:20:47,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:51,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:20:56,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:56,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:58,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:20:58,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:59,827 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.022e+02 2.344e+02 2.789e+02 4.683e+02, threshold=4.688e+02, percent-clipped=2.0 2023-09-29 13:20:59,869 INFO [train.py:1039] (1/4) Epoch 11, batch 3500, loss[loss=0.2084, simple_loss=0.2452, pruned_loss=0.08581, over 19547.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2721, pruned_loss=0.06564, over 4724550.67 frames. ], batch size: 389, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:21:00,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:04,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:21:04,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 13:21:05,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.60 vs. limit=10.0 2023-09-29 13:21:06,921 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.29 vs. limit=22.5 2023-09-29 13:21:08,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:21:11,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:21:12,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:12,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 13:21:13,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=377473.3333333333, ans=0.1 2023-09-29 13:21:20,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:21:20,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:21:22,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:21:22,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:22,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=377540.0, ans=0.125 2023-09-29 13:21:23,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:21:23,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:24,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:24,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 13:21:27,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:27,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:21:29,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:33,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:35,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 13:21:35,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:38,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:40,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:21:42,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:45,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:21:45,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:45,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=377606.6666666667, ans=0.2 2023-09-29 13:21:46,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 13:21:46,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 13:21:48,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 13:21:48,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:50,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:50,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:51,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:21:55,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:21:55,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:22:02,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:03,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 13:22:03,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 13:22:03,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:05,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:06,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:08,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:11,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 13:22:11,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:14,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:22:16,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 13:22:17,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 13:22:19,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:19,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:21,012 INFO [train.py:1039] (1/4) Epoch 11, batch 3550, loss[loss=0.1899, simple_loss=0.267, pruned_loss=0.05637, over 24505.00 frames. ], tot_loss[loss=0.1996, simple_loss=0.2699, pruned_loss=0.06462, over 4718457.24 frames. ], batch size: 66, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:22:21,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:21,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:23,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=377806.6666666667, ans=0.125 2023-09-29 13:22:24,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:22:35,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:36,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 13:22:38,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=377873.3333333333, ans=0.125 2023-09-29 13:22:39,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:41,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:22:42,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:44,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:22:44,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:22:47,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:47,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:22:47,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:47,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:22:49,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:22:55,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:22:55,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:58,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:22:58,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:00,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:23:00,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 13:23:00,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:01,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:02,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=377940.0, ans=0.125 2023-09-29 13:23:02,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=377940.0, ans=0.125 2023-09-29 13:23:03,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:23:09,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=378006.6666666667, ans=0.0 2023-09-29 13:23:10,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:12,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:23:13,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:13,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 13:23:15,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:23:18,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 13:23:19,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:23:21,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:23:21,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:23:25,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 13:23:27,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:30,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=378073.3333333333, ans=0.2 2023-09-29 13:23:31,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:33,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 13:23:33,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:38,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:39,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 13:23:43,826 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.951e+02 2.213e+02 2.629e+02 3.694e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 13:23:43,869 INFO [train.py:1039] (1/4) Epoch 11, batch 3600, loss[loss=0.1874, simple_loss=0.2647, pruned_loss=0.05504, over 24459.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2696, pruned_loss=0.06374, over 4736733.61 frames. ], batch size: 63, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:23:45,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 13:23:45,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:23:47,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:23:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:49,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:50,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:23:54,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:55,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:57,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:23:57,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:23:59,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:59,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 13:24:02,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=378206.6666666667, ans=0.07 2023-09-29 13:24:03,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:24:05,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:08,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:12,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:13,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.30 vs. limit=15.0 2023-09-29 13:24:13,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:24:13,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:24:13,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 13:24:15,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:15,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=378273.3333333333, ans=0.2 2023-09-29 13:24:18,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:21,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:24:22,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:24,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:25,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:24:25,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 13:24:27,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=378273.3333333333, ans=0.1 2023-09-29 13:24:33,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=378340.0, ans=0.0 2023-09-29 13:24:35,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:24:36,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:24:37,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 13:24:41,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:24:45,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:46,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=378340.0, ans=0.0 2023-09-29 13:24:47,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=378340.0, ans=0.09899494936611666 2023-09-29 13:24:48,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:55,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:24:56,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:24:56,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 13:24:57,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 13:24:59,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 13:25:02,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:25:02,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:25:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 13:25:03,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:03,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:25:03,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:04,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 13:25:06,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 13:25:07,553 INFO [train.py:1039] (1/4) Epoch 11, batch 3650, loss[loss=0.1911, simple_loss=0.2776, pruned_loss=0.05228, over 24319.00 frames. ], tot_loss[loss=0.1996, simple_loss=0.2707, pruned_loss=0.06428, over 4726103.81 frames. ], batch size: 74, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:25:07,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:25:08,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 13:25:14,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 13:25:16,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:25:19,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=378473.3333333333, ans=0.125 2023-09-29 13:25:20,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 13:25:21,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=378473.3333333333, ans=0.125 2023-09-29 13:25:21,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=378473.3333333333, ans=0.125 2023-09-29 13:25:23,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 13:25:27,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:25:27,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:25:28,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:25:31,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:25:32,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:32,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 13:25:33,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:25:34,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:34,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 13:25:36,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:25:37,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:25:37,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:37,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:25:39,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 13:25:41,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 13:25:43,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:25:44,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 13:25:46,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:25:46,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:25:54,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:25:56,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:56,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:25:56,395 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:25:57,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:25:59,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:26:01,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.74 vs. limit=15.0 2023-09-29 13:26:02,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:26:05,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:06,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:06,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:26:06,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:26:08,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:26:09,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:15,151 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 13:26:19,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:19,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:20,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:26:21,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:21,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:26:23,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:24,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 13:26:24,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:26,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:26:28,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=378740.0, ans=0.125 2023-09-29 13:26:29,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:30,855 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.035e+02 2.257e+02 2.600e+02 3.794e+02, threshold=4.515e+02, percent-clipped=0.0 2023-09-29 13:26:30,918 INFO [train.py:1039] (1/4) Epoch 11, batch 3700, loss[loss=0.1912, simple_loss=0.2754, pruned_loss=0.05347, over 24448.00 frames. ], tot_loss[loss=0.2011, simple_loss=0.2718, pruned_loss=0.06515, over 4715644.49 frames. ], batch size: 69, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:26:31,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:26:31,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=378806.6666666667, ans=0.125 2023-09-29 13:26:31,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=378806.6666666667, ans=0.2 2023-09-29 13:26:34,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:34,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 13:26:35,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:36,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:26:36,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:26:39,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:26:41,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:42,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:43,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:26:44,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:44,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:26:46,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:48,272 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 13:26:56,082 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:26:57,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:26:57,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:26:58,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.46 vs. limit=15.0 2023-09-29 13:26:59,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:26:59,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 13:26:59,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:04,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:04,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 13:27:08,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:08,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=378940.0, ans=0.0 2023-09-29 13:27:10,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:27:13,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:13,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:27:14,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:27:19,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:19,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 13:27:19,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:27:19,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 13:27:23,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:27:25,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:27:28,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:28,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 13:27:31,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:27:31,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:27:31,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:31,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:35,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:36,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 13:27:38,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 13:27:38,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:27:38,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:27:40,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:27:41,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:27:47,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:49,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:27:51,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:27:53,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 13:27:55,220 INFO [train.py:1039] (1/4) Epoch 11, batch 3750, loss[loss=0.1937, simple_loss=0.2751, pruned_loss=0.05618, over 24497.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.273, pruned_loss=0.06562, over 4712160.90 frames. ], batch size: 66, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:27:55,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:27:57,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:27:58,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 13:27:58,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:28:00,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:01,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:03,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:06,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:09,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:28:11,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:28:13,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:28:16,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:18,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 13:28:20,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:20,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:21,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:23,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 13:28:28,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 13:28:30,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:31,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:32,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=379273.3333333333, ans=0.125 2023-09-29 13:28:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:39,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:39,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=379273.3333333333, ans=0.125 2023-09-29 13:28:42,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 13:28:44,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 13:28:48,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:53,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:53,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:28:58,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:29:02,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:29:03,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:29:05,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:29:07,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:29:08,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:29:10,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=379406.6666666667, ans=0.0 2023-09-29 13:29:12,250 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:29:16,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:29:18,117 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.004e+02 2.192e+02 2.441e+02 3.152e+02, threshold=4.385e+02, percent-clipped=0.0 2023-09-29 13:29:18,161 INFO [train.py:1039] (1/4) Epoch 11, batch 3800, loss[loss=0.1976, simple_loss=0.2495, pruned_loss=0.07288, over 22729.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2728, pruned_loss=0.06559, over 4713225.67 frames. ], batch size: 322, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:29:19,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:19,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:29:21,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 13:29:23,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:23,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:25,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:29:27,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 13:29:27,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:29,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:29:30,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:32,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:29:32,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:34,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 13:29:39,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 13:29:39,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:29:40,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:45,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:29:45,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:29:47,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:29:47,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:49,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:51,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:51,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=379606.6666666667, ans=0.09899494936611666 2023-09-29 13:29:53,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=379606.6666666667, ans=0.2 2023-09-29 13:29:55,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.27 vs. limit=12.0 2023-09-29 13:29:56,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:29:56,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 13:29:58,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:30:05,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:09,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=379673.3333333333, ans=0.125 2023-09-29 13:30:10,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:30:12,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 13:30:14,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 13:30:15,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:17,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:30:17,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:18,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 13:30:23,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 13:30:23,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 13:30:24,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:26,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:31,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:30:33,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:30:40,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:30:40,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 13:30:42,637 INFO [train.py:1039] (1/4) Epoch 11, batch 3850, loss[loss=0.1951, simple_loss=0.2674, pruned_loss=0.06137, over 24521.00 frames. ], tot_loss[loss=0.2011, simple_loss=0.2718, pruned_loss=0.06516, over 4715290.53 frames. ], batch size: 63, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:30:42,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:30:42,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:43,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=379806.6666666667, ans=0.125 2023-09-29 13:30:48,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:30:51,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:54,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:30:56,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 13:31:01,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:05,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:31:06,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:31:06,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:31:10,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:12,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:31:12,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:12,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:31:14,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:16,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:18,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:18,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:31:18,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 13:31:18,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 13:31:19,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:31:19,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:21,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:22,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:22,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 13:31:25,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 13:31:28,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:30,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 13:31:33,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:31:40,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:40,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:45,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:45,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 13:31:49,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 13:31:50,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:50,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:55,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:31:55,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:31:55,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:31:56,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 13:31:58,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:32:01,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 13:32:01,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:01,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:32:05,654 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.057e+02 2.295e+02 2.790e+02 3.822e+02, threshold=4.589e+02, percent-clipped=0.0 2023-09-29 13:32:05,723 INFO [train.py:1039] (1/4) Epoch 11, batch 3900, loss[loss=0.2078, simple_loss=0.2679, pruned_loss=0.07385, over 23754.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2713, pruned_loss=0.0643, over 4725061.27 frames. ], batch size: 164, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:32:05,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:07,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:32:07,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:07,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:32:09,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:10,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 13:32:10,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:15,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:16,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:16,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:32:16,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:20,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:20,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:23,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:32:23,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 13:32:25,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:27,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 13:32:27,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:29,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 13:32:29,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 13:32:35,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:37,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:37,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:32:37,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:32:40,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:41,742 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.41 vs. limit=22.5 2023-09-29 13:32:42,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:32:44,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:32:44,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:32:45,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:32:51,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=380273.3333333333, ans=0.0 2023-09-29 13:32:52,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:52,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:33:02,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:33:04,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:33:13,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:17,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:17,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 13:33:17,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 13:33:17,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:18,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.23 vs. limit=22.5 2023-09-29 13:33:20,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 13:33:21,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:33:23,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 13:33:23,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=380406.6666666667, ans=0.0 2023-09-29 13:33:28,103 INFO [train.py:1039] (1/4) Epoch 11, batch 3950, loss[loss=0.2004, simple_loss=0.2791, pruned_loss=0.06087, over 24645.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2705, pruned_loss=0.06386, over 4732318.57 frames. ], batch size: 68, lr: 9.34e-03, grad_scale: 16.0 2023-09-29 13:33:29,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:33:31,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 13:33:31,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:33:34,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:33:36,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:33:45,496 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 13:33:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:45,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 13:33:47,005 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 13:33:48,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:51,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:51,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:33:51,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:54,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 13:33:57,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:33:59,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:59,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:34:00,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:34:00,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:34:01,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=380606.6666666667, ans=0.125 2023-09-29 13:34:12,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:34:14,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:34:19,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 13:34:21,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=380673.3333333333, ans=0.125 2023-09-29 13:34:25,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 13:34:25,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 13:34:25,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:34:27,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:34:36,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:34:36,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:34:36,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:34:36,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:34:36,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 13:34:42,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:34:43,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:34:47,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 13:34:48,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=380740.0, ans=0.125 2023-09-29 13:34:50,665 INFO [train.py:1039] (1/4) Epoch 11, batch 4000, loss[loss=0.2013, simple_loss=0.2728, pruned_loss=0.06493, over 24486.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2705, pruned_loss=0.06394, over 4727281.13 frames. ], batch size: 63, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:34:52,658 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.905e+02 2.140e+02 2.457e+02 3.925e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 13:34:58,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:05,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:05,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=380873.3333333333, ans=0.0 2023-09-29 13:35:10,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:10,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:35:10,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:10,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 13:35:12,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:35:12,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 13:35:12,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:35:12,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 13:35:14,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:18,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:35:18,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:35:18,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:35:18,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:18,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:35:20,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:35:22,434 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 13:35:22,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:35:22,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:25,216 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=380940.0, ans=0.125 2023-09-29 13:35:26,452 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 13:35:26,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:35:26,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:34,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 13:35:34,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:37,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:35:38,842 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 13:35:40,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:35:40,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 13:35:40,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:35:42,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:42,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:35:45,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:35:45,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:35:46,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:47,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 13:35:48,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:50,566 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 13:35:54,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:35:57,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:35:59,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:35:59,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:00,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:36:02,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:06,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:09,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:36:11,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 13:36:13,376 INFO [train.py:1039] (1/4) Epoch 11, batch 4050, loss[loss=0.1974, simple_loss=0.2732, pruned_loss=0.06084, over 23975.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.2715, pruned_loss=0.06473, over 4720592.94 frames. ], batch size: 86, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:36:13,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:36:13,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:14,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:36:16,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:19,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:22,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:22,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=381140.0, ans=0.1 2023-09-29 13:36:25,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:36:26,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:36:28,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:36:28,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:36:32,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:34,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=381206.6666666667, ans=0.2 2023-09-29 13:36:35,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.50 vs. limit=15.0 2023-09-29 13:36:35,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:38,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 13:36:40,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 13:36:40,202 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 13:36:41,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:36:42,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=381206.6666666667, ans=0.125 2023-09-29 13:36:51,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 13:36:51,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=381273.3333333333, ans=0.0 2023-09-29 13:36:52,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:36:56,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:59,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:59,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:36:59,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:37:03,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:37:07,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 13:37:07,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:37:09,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:11,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=381340.0, ans=0.0 2023-09-29 13:37:12,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 13:37:16,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:22,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=381406.6666666667, ans=22.5 2023-09-29 13:37:26,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 13:37:26,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:37:26,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:37:29,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 13:37:29,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 13:37:29,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:32,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:37:34,406 INFO [train.py:1039] (1/4) Epoch 11, batch 4100, loss[loss=0.2206, simple_loss=0.2821, pruned_loss=0.07956, over 23778.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2717, pruned_loss=0.06529, over 4717355.35 frames. ], batch size: 232, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:37:34,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:34,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:37:35,988 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.063e+02 2.315e+02 3.202e+02 5.550e+02, threshold=4.630e+02, percent-clipped=7.0 2023-09-29 13:37:41,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 13:37:43,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 13:37:44,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 13:37:46,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 13:37:46,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:47,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=381473.3333333333, ans=0.2 2023-09-29 13:37:48,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:37:49,925 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 13:37:52,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:37:54,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:37:54,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:54,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=381540.0, ans=0.0 2023-09-29 13:37:55,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:37:59,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:38:01,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:38:01,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:38:01,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 13:38:02,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:02,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:38:02,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:02,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:38:04,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 13:38:08,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:08,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 13:38:10,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:38:14,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:14,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 13:38:16,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:38:17,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:38:17,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:38:18,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=381606.6666666667, ans=0.0 2023-09-29 13:38:19,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 13:38:21,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:38:21,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:38:23,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=381673.3333333333, ans=0.125 2023-09-29 13:38:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 13:38:24,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:24,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:38:27,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:34,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:38:37,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:39,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:38:47,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:38:47,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:50,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.92 vs. limit=22.5 2023-09-29 13:38:50,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:53,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:38:57,298 INFO [train.py:1039] (1/4) Epoch 11, batch 4150, loss[loss=0.1901, simple_loss=0.2706, pruned_loss=0.05482, over 24466.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2719, pruned_loss=0.06558, over 4710661.09 frames. ], batch size: 69, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:38:58,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:39:00,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:39:02,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:39:02,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:05,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 13:39:06,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:06,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 13:39:06,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 13:39:06,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 13:39:08,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:14,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:39:14,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:18,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:19,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:39:21,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:39:23,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:39:23,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:25,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:39:25,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=381873.3333333333, ans=0.125 2023-09-29 13:39:27,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=381873.3333333333, ans=0.125 2023-09-29 13:39:30,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:34,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:34,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 13:39:38,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 13:39:38,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:39:39,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 13:39:39,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:39:39,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:39:44,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:39:44,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:48,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 13:39:50,511 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.00 vs. limit=22.5 2023-09-29 13:39:51,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:39:51,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:39:52,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 13:39:54,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:56,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 13:39:59,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:40:00,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:40:02,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:03,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 13:40:03,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:03,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:40:04,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:40:07,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 13:40:07,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:07,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:40:07,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=382073.3333333333, ans=0.0 2023-09-29 13:40:08,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:40:08,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 13:40:08,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:40:08,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:40:10,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:40:12,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:12,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 13:40:14,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:40:18,767 INFO [train.py:1039] (1/4) Epoch 11, batch 4200, loss[loss=0.1878, simple_loss=0.2709, pruned_loss=0.05237, over 24475.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2713, pruned_loss=0.06493, over 4718572.35 frames. ], batch size: 69, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:40:20,254 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.930e+02 2.193e+02 2.587e+02 4.330e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-29 13:40:20,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:40:20,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=382140.0, ans=0.125 2023-09-29 13:40:22,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 13:40:24,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:40:26,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:26,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:40:28,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:28,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 13:40:33,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 13:40:35,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:35,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=382206.6666666667, ans=0.0 2023-09-29 13:40:36,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:38,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:40:41,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:40:44,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:40:44,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:44,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 13:40:44,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:45,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=382206.6666666667, ans=0.0 2023-09-29 13:40:46,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:47,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:48,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:40:48,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:40:51,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 13:40:51,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:55,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:40:56,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:41:00,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:41:01,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:05,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:41:05,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 13:41:05,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:05,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:41:11,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:41:14,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:14,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=382340.0, ans=0.125 2023-09-29 13:41:20,295 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.33 vs. limit=6.0 2023-09-29 13:41:22,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:41:24,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 13:41:24,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=382406.6666666667, ans=0.1 2023-09-29 13:41:27,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:30,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:41:31,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=382406.6666666667, ans=0.125 2023-09-29 13:41:32,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:34,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 13:41:38,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:41:41,371 INFO [train.py:1039] (1/4) Epoch 11, batch 4250, loss[loss=0.2013, simple_loss=0.2793, pruned_loss=0.06168, over 24492.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2698, pruned_loss=0.06432, over 4717407.87 frames. ], batch size: 63, lr: 9.31e-03, grad_scale: 32.0 2023-09-29 13:41:41,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=382473.3333333333, ans=0.125 2023-09-29 13:41:44,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:44,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:41:48,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:53,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:41:53,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 13:41:55,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:57,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:01,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:06,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:07,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:07,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:42:07,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:08,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=382540.0, ans=0.125 2023-09-29 13:42:10,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:10,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:11,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:14,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:42:14,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:16,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 13:42:20,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 13:42:20,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:22,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:22,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:22,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=382606.6666666667, ans=0.0 2023-09-29 13:42:23,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:42:25,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:25,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:28,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 13:42:31,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:42:36,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:42:38,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:38,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 13:42:39,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:42:41,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 13:42:42,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:42:44,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:42:44,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:44,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:46,133 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.11 vs. limit=15.0 2023-09-29 13:42:48,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 13:42:49,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:42:51,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:42:54,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:56,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:57,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:42:59,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:00,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:02,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:43:03,987 INFO [train.py:1039] (1/4) Epoch 11, batch 4300, loss[loss=0.2134, simple_loss=0.288, pruned_loss=0.06935, over 24406.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2692, pruned_loss=0.06482, over 4697571.73 frames. ], batch size: 77, lr: 9.31e-03, grad_scale: 16.0 2023-09-29 13:43:04,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:04,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 13:43:05,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:07,086 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.989e+02 2.378e+02 2.757e+02 5.301e+02, threshold=4.756e+02, percent-clipped=4.0 2023-09-29 13:43:12,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:12,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:17,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:23,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:43:23,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 13:43:24,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=382873.3333333333, ans=0.025 2023-09-29 13:43:25,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:43:27,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:43:27,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:43:28,849 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 13:43:33,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:43:33,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:43:35,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 13:43:37,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:43:37,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 13:43:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:43:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:43:44,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:43:44,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:46,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:43:46,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=382940.0, ans=0.2 2023-09-29 13:43:47,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:47,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:47,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 13:43:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 13:43:49,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=382940.0, ans=0.0 2023-09-29 13:43:52,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:54,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:54,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:43:54,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:56,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:56,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 13:43:56,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 13:43:56,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 13:43:57,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:43:57,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 13:43:59,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 13:44:02,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:05,325 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 13:44:06,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:44:08,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:09,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:13,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 13:44:13,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:44:13,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:14,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:14,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:14,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:44:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:44:21,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:22,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=383073.3333333333, ans=0.125 2023-09-29 13:44:23,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:23,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:26,388 INFO [train.py:1039] (1/4) Epoch 11, batch 4350, loss[loss=0.2676, simple_loss=0.3109, pruned_loss=0.1121, over 19536.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2706, pruned_loss=0.06543, over 4704038.43 frames. ], batch size: 388, lr: 9.30e-03, grad_scale: 16.0 2023-09-29 13:44:30,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 13:44:31,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:44:34,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:44:37,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:37,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=383140.0, ans=0.2 2023-09-29 13:44:40,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:44:40,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:44:45,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:44:49,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:52,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:44:52,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:54,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:44:54,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=383206.6666666667, ans=0.125 2023-09-29 13:44:57,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:45:00,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:45:07,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 13:45:08,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:08,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:13,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:16,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 13:45:19,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-09-29 13:45:19,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:21,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:45:26,574 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 13:45:28,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:28,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:45:29,745 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 13:45:29,852 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 13:45:29,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:29,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:31,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:45:31,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:34,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:34,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:45:37,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 13:45:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:37,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 13:45:39,316 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 13:45:39,323 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 13:45:40,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 13:45:43,417 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.56 vs. limit=22.5 2023-09-29 13:45:43,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:45:43,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:45:43,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:45:45,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:45:45,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 13:45:47,234 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 13:45:47,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:48,672 INFO [train.py:1039] (1/4) Epoch 11, batch 4400, loss[loss=0.2202, simple_loss=0.2953, pruned_loss=0.0725, over 23732.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2717, pruned_loss=0.06573, over 4713551.07 frames. ], batch size: 85, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:45:49,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=383473.3333333333, ans=0.125 2023-09-29 13:45:50,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:45:50,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:51,787 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.226e+02 2.866e+02 4.775e+02, threshold=4.452e+02, percent-clipped=1.0 2023-09-29 13:45:52,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:55,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 13:45:55,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 13:45:55,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 13:45:57,241 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 13:45:57,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:45:57,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:46:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 13:46:02,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:04,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:04,264 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 13:46:08,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:08,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 13:46:09,957 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 13:46:14,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 13:46:14,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 13:46:15,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 13:46:15,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:17,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:19,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:20,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:20,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 13:46:20,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 13:46:22,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:23,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:46:23,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:25,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:27,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:27,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 13:46:27,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.23 vs. limit=10.0 2023-09-29 13:46:28,470 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 13:46:28,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=383606.6666666667, ans=0.125 2023-09-29 13:46:33,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:40,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:41,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 13:46:45,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=383673.3333333333, ans=0.1 2023-09-29 13:46:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:46:48,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:46:50,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=383673.3333333333, ans=0.0 2023-09-29 13:46:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:46:52,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 13:46:54,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:46:54,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:46:54,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:46:54,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:46:58,278 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-09-29 13:46:59,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 13:47:02,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 13:47:03,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 13:47:03,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:03,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 13:47:03,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:47:07,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:47:09,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 13:47:12,142 INFO [train.py:1039] (1/4) Epoch 11, batch 4450, loss[loss=0.2644, simple_loss=0.3127, pruned_loss=0.108, over 19551.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2729, pruned_loss=0.0666, over 4709413.71 frames. ], batch size: 388, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:47:12,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:47:16,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:17,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:47:23,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:47:24,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:47:26,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:26,676 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:47:29,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:47:30,120 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.57 vs. limit=15.0 2023-09-29 13:47:31,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:47:31,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:32,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 13:47:32,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:34,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:34,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:47:35,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:47:35,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=383873.3333333333, ans=0.0 2023-09-29 13:47:38,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:47:45,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:46,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:48,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:48,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=383940.0, ans=0.2 2023-09-29 13:47:50,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:50,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:47:57,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:47:57,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 13:47:58,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 13:47:58,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:48:02,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:02,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 13:48:06,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:48:09,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:11,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 13:48:11,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:11,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:11,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:48:12,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:13,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:16,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:48:16,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 13:48:19,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:48:21,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:48:23,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:23,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:25,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:48:27,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:48:27,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.84 vs. limit=15.0 2023-09-29 13:48:30,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 13:48:32,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:48:36,512 INFO [train.py:1039] (1/4) Epoch 11, batch 4500, loss[loss=0.186, simple_loss=0.2527, pruned_loss=0.05967, over 21320.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2733, pruned_loss=0.06694, over 4702327.16 frames. ], batch size: 46, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:48:38,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:39,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 13:48:39,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 13:48:41,278 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.027e+02 2.276e+02 2.770e+02 4.229e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 13:48:41,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:48:47,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:48,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:49,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:48:50,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:48:50,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:48:50,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:48:53,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=384206.6666666667, ans=0.1 2023-09-29 13:49:03,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:49:05,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:49:08,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:09,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:49:09,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:49:17,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:49:18,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.29 vs. limit=15.0 2023-09-29 13:49:20,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:49:24,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=384340.0, ans=0.5 2023-09-29 13:49:25,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:49:28,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:49:28,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 13:49:30,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:30,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:36,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:49:37,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 13:49:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:49:37,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:42,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:49:42,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:49:44,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:47,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:49:47,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:49:50,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 13:49:51,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 13:49:51,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 13:49:54,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.56 vs. limit=10.0 2023-09-29 13:49:56,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 13:49:58,073 INFO [train.py:1039] (1/4) Epoch 11, batch 4550, loss[loss=0.2241, simple_loss=0.2933, pruned_loss=0.07744, over 23399.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2723, pruned_loss=0.06625, over 4702480.43 frames. ], batch size: 93, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:49:58,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 13:49:59,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:03,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:04,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:07,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:10,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:50:14,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:50:17,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:17,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:50:17,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:20,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:21,113 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-29 13:50:21,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:24,145 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.06 vs. limit=22.5 2023-09-29 13:50:24,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:50:26,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.99 vs. limit=15.0 2023-09-29 13:50:27,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 13:50:27,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 13:50:29,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:50:32,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 13:50:35,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 13:50:35,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:40,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 13:50:42,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:50:45,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:45,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:45,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:50:49,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 13:50:51,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:50:52,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:52,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:55,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:57,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 13:50:57,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 13:50:58,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:50:58,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 13:51:01,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 13:51:01,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:51:03,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:03,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:04,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:04,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:51:05,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:51:06,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 13:51:08,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:51:08,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:51:08,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 13:51:08,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:51:08,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 13:51:12,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:51:12,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:51:14,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:51:14,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:14,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:51:18,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:51:18,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:51:21,626 INFO [train.py:1039] (1/4) Epoch 11, batch 4600, loss[loss=0.174, simple_loss=0.2107, pruned_loss=0.06863, over 19298.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.2707, pruned_loss=0.06542, over 4704958.17 frames. ], batch size: 389, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:51:21,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:23,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:25,952 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.102e+02 2.367e+02 2.907e+02 4.657e+02, threshold=4.735e+02, percent-clipped=1.0 2023-09-29 13:51:26,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:51:26,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:51:27,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:29,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 13:51:30,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:51:35,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:51:36,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:37,569 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.98 vs. limit=15.0 2023-09-29 13:51:40,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:46,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 13:51:48,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:51,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:53,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=384940.0, ans=0.125 2023-09-29 13:51:55,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:51:55,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:01,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 13:52:01,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:52:02,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:07,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:07,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:52:08,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:52:13,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 13:52:13,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:52:20,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:20,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:52:21,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:21,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 13:52:22,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:23,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 13:52:23,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:25,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:27,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:27,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:29,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:30,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 13:52:30,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 13:52:32,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 13:52:32,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:33,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:35,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:35,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:42,915 INFO [train.py:1039] (1/4) Epoch 11, batch 4650, loss[loss=0.2144, simple_loss=0.2776, pruned_loss=0.0756, over 23717.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.271, pruned_loss=0.06496, over 4715673.19 frames. ], batch size: 232, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:52:46,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:52:49,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:50,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:50,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:52:52,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:52,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:52,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:58,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 13:53:01,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:53:05,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 13:53:05,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:53:05,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 13:53:05,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:53:06,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 13:53:06,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 13:53:06,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:06,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:53:09,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:53:10,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=15.0 2023-09-29 13:53:11,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:11,500 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 13:53:14,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:16,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 13:53:19,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:19,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:53:20,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 13:53:22,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:53:25,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:53:29,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:29,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=385273.3333333333, ans=0.125 2023-09-29 13:53:34,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:37,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:40,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:40,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:53:41,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 13:53:41,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 13:53:43,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 13:53:43,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 13:53:45,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:53:51,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:53:51,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:53:51,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 13:53:51,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:52,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:52,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:53:54,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:53:57,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:53:57,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:59,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:54:02,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:02,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:54:04,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:54:04,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 13:54:06,314 INFO [train.py:1039] (1/4) Epoch 11, batch 4700, loss[loss=0.2141, simple_loss=0.2775, pruned_loss=0.07529, over 23392.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2716, pruned_loss=0.06519, over 4712602.56 frames. ], batch size: 134, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:54:06,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:54:06,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 13:54:10,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=385473.3333333333, ans=0.125 2023-09-29 13:54:11,830 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.872e+02 2.042e+02 2.233e+02 3.363e+02, threshold=4.084e+02, percent-clipped=0.0 2023-09-29 13:54:13,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:14,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=385473.3333333333, ans=0.0 2023-09-29 13:54:15,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:16,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:54:18,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:20,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:54:24,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=385540.0, ans=0.125 2023-09-29 13:54:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 13:54:27,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 13:54:30,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:31,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:54:31,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:54:31,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=385540.0, ans=0.125 2023-09-29 13:54:37,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:44,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:54:45,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:54:47,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:55,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 13:54:55,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:54:56,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.76 vs. limit=22.5 2023-09-29 13:54:58,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:00,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 13:55:02,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=385673.3333333333, ans=0.1 2023-09-29 13:55:03,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:07,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:55:07,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 13:55:09,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:09,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:12,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:55:12,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:55:12,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 13:55:12,959 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 13:55:14,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:14,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 13:55:16,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:22,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 13:55:26,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:55:27,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:29,196 INFO [train.py:1039] (1/4) Epoch 11, batch 4750, loss[loss=0.1983, simple_loss=0.2823, pruned_loss=0.05714, over 24541.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2718, pruned_loss=0.06598, over 4696151.30 frames. ], batch size: 71, lr: 9.27e-03, grad_scale: 16.0 2023-09-29 13:55:32,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:33,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:55:35,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 13:55:35,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:55:39,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 13:55:40,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:55:41,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:43,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:43,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.99 vs. limit=22.5 2023-09-29 13:55:47,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 13:55:51,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:55:53,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 13:55:54,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:55,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=385873.3333333333, ans=0.0 2023-09-29 13:55:56,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:57,909 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 13:55:57,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 13:56:04,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 13:56:04,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=385940.0, ans=0.125 2023-09-29 13:56:09,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:12,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:12,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=385940.0, ans=0.0 2023-09-29 13:56:14,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:56:14,486 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 13:56:14,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:18,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:56:21,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:56:21,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 13:56:23,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 13:56:23,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:56:23,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:56:24,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:24,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:56:26,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 13:56:27,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 13:56:31,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:33,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:56:33,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 13:56:35,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:56:36,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:38,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:56:39,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:39,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:56:42,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:44,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 13:56:44,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 13:56:46,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 13:56:48,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:56:48,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=386073.3333333333, ans=0.125 2023-09-29 13:56:49,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:50,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 13:56:52,872 INFO [train.py:1039] (1/4) Epoch 11, batch 4800, loss[loss=0.1942, simple_loss=0.2722, pruned_loss=0.05808, over 24528.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2729, pruned_loss=0.06681, over 4699124.87 frames. ], batch size: 71, lr: 9.27e-03, grad_scale: 32.0 2023-09-29 13:56:54,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:54,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:57,613 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.978e+02 2.285e+02 2.567e+02 3.711e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 13:56:59,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:57:01,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:01,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:02,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 13:57:02,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:57:04,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:57:05,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:57:09,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:12,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:12,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:57:14,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:14,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:57:15,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:17,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:18,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:22,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:57:25,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=386273.3333333333, ans=0.125 2023-09-29 13:57:26,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:57:28,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:30,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 13:57:30,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 13:57:31,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:31,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:57:33,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:57:33,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:33,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:57:34,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=386273.3333333333, ans=0.0 2023-09-29 13:57:36,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:57:36,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:41,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:43,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:57:48,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 13:57:48,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:49,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:57:49,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:54,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:55,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=386340.0, ans=0.125 2023-09-29 13:57:56,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:57:56,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:56,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:57:57,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:57:58,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:58:02,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:02,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:02,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:58:04,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 13:58:07,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 13:58:07,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:07,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:08,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:08,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:12,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:58:13,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=386473.3333333333, ans=0.125 2023-09-29 13:58:14,044 INFO [train.py:1039] (1/4) Epoch 11, batch 4850, loss[loss=0.2007, simple_loss=0.2814, pruned_loss=0.06, over 24397.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2727, pruned_loss=0.0666, over 4713458.09 frames. ], batch size: 77, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:58:16,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=386473.3333333333, ans=0.0 2023-09-29 13:58:16,667 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.19 vs. limit=15.0 2023-09-29 13:58:22,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 13:58:24,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:27,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:27,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:58:27,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:27,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=386473.3333333333, ans=0.2 2023-09-29 13:58:33,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:33,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:58:34,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:58:34,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 13:58:40,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:42,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=386540.0, ans=0.1 2023-09-29 13:58:43,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:58:43,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:58:45,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:58:45,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 13:58:47,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:47,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 13:58:52,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 13:58:53,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:59:00,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:59:00,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 13:59:02,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:59:02,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:59:03,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=386673.3333333333, ans=0.0 2023-09-29 13:59:05,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=386673.3333333333, ans=0.2 2023-09-29 13:59:06,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:59:08,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 13:59:08,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:09,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 13:59:09,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:10,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.23 vs. limit=15.0 2023-09-29 13:59:11,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:12,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 13:59:22,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:27,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=386740.0, ans=0.2 2023-09-29 13:59:30,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:59:30,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:30,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=386740.0, ans=0.2 2023-09-29 13:59:35,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=386806.6666666667, ans=0.125 2023-09-29 13:59:36,684 INFO [train.py:1039] (1/4) Epoch 11, batch 4900, loss[loss=0.2106, simple_loss=0.2867, pruned_loss=0.06728, over 23294.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2717, pruned_loss=0.06572, over 4697780.83 frames. ], batch size: 93, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:59:36,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 13:59:36,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:59:40,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=386806.6666666667, ans=0.125 2023-09-29 13:59:40,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=386806.6666666667, ans=0.125 2023-09-29 13:59:41,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:43,967 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.984e+02 2.247e+02 2.564e+02 4.606e+02, threshold=4.494e+02, percent-clipped=1.0 2023-09-29 13:59:44,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:44,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:59:47,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 13:59:51,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 13:59:56,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 13:59:57,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 13:59:58,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:59:58,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:58,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:59:58,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:58,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:00:00,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 14:00:03,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 14:00:04,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:00:06,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:00:08,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:00:09,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:00:09,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:10,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.99 vs. limit=15.0 2023-09-29 14:00:11,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:11,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 14:00:13,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:00:15,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:00:16,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 14:00:16,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 14:00:19,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 14:00:21,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:00:21,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:00:21,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:00:23,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:23,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:00:23,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:00:24,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 14:00:26,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=387006.6666666667, ans=0.1 2023-09-29 14:00:27,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:29,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:00:31,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:00:34,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 14:00:34,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:00:36,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:00:37,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 14:00:45,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:47,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:00:49,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 14:00:49,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:00:49,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:00:54,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:57,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:00:57,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:00:59,087 INFO [train.py:1039] (1/4) Epoch 11, batch 4950, loss[loss=0.2053, simple_loss=0.2897, pruned_loss=0.06042, over 24695.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2697, pruned_loss=0.06436, over 4696486.72 frames. ], batch size: 73, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 14:00:59,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:59,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 14:00:59,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=387140.0, ans=0.125 2023-09-29 14:01:00,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:01:03,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:03,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:01:08,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 14:01:08,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 14:01:08,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=387140.0, ans=0.125 2023-09-29 14:01:10,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:01:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 14:01:10,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:10,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:01:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:01:11,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:14,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:01:16,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:01:17,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:20,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:20,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:01:24,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:01:25,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.16 vs. limit=12.0 2023-09-29 14:01:28,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=12.35 vs. limit=10.0 2023-09-29 14:01:29,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:32,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:01:33,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:33,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:36,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:01:36,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 14:01:38,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 14:01:38,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=387273.3333333333, ans=0.0 2023-09-29 14:01:41,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:41,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=387273.3333333333, ans=0.125 2023-09-29 14:01:43,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:01:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:01:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:01:45,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:01:45,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:01:48,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:50,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:01:51,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:01:53,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:53,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:55,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 14:01:55,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:01:55,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=387340.0, ans=0.0 2023-09-29 14:01:56,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:01:59,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=387340.0, ans=0.125 2023-09-29 14:02:02,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:03,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:02:03,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:02:03,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:03,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=387406.6666666667, ans=0.1 2023-09-29 14:02:05,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:02:05,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:02:08,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:02:09,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:02:09,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:02:10,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 14:02:13,701 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.17 vs. limit=15.0 2023-09-29 14:02:16,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:21,059 INFO [train.py:1039] (1/4) Epoch 11, batch 5000, loss[loss=0.1756, simple_loss=0.2574, pruned_loss=0.04683, over 24324.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2689, pruned_loss=0.06415, over 4693282.12 frames. ], batch size: 61, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:02:21,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 14:02:21,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:02:24,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=387473.3333333333, ans=0.125 2023-09-29 14:02:27,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:27,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:29,463 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.930e+02 2.192e+02 2.539e+02 4.135e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 14:02:29,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 14:02:29,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 14:02:32,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:02:35,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 14:02:37,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:02:37,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:02:37,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 14:02:37,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:38,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:02:38,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 14:02:38,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:40,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:02:40,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 14:02:41,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 14:02:41,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:02:43,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 14:02:43,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:02:43,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:43,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:02:43,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 14:02:43,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 14:02:47,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 14:02:47,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:47,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:50,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 14:02:50,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:50,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:51,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:53,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:02:54,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 14:02:56,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.70 vs. limit=15.0 2023-09-29 14:02:56,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:57,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:03:01,899 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 14:03:05,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:03:06,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:03:06,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:09,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 14:03:10,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:03:10,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:10,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:13,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 14:03:15,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:25,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 14:03:28,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:39,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:40,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:40,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:03:40,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:40,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:03:40,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:03:42,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:45,667 INFO [train.py:1039] (1/4) Epoch 11, batch 5050, loss[loss=0.1744, simple_loss=0.2491, pruned_loss=0.04987, over 21960.00 frames. ], tot_loss[loss=0.2001, simple_loss=0.2702, pruned_loss=0.06498, over 4680087.12 frames. ], batch size: 48, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:03:47,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:47,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 14:03:48,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:03:50,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:50,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:03:52,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 14:03:54,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:54,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:56,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=387806.6666666667, ans=0.0 2023-09-29 14:03:57,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:03:58,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:03:59,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:04:03,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=387873.3333333333, ans=0.125 2023-09-29 14:04:08,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 14:04:08,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:04:10,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:11,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 14:04:11,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:14,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:14,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:04:16,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:04:16,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 14:04:16,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 14:04:17,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:19,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:22,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:23,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=387940.0, ans=0.125 2023-09-29 14:04:24,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 14:04:25,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:29,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 14:04:30,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:04:32,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:04:32,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:33,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:34,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=15.0 2023-09-29 14:04:35,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:04:37,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:04:37,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=388006.6666666667, ans=0.125 2023-09-29 14:04:38,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:38,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:04:38,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:04:39,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 14:04:40,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:04:41,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:46,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:46,683 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 14:04:46,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:04:48,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:04:50,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:50,468 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 14:04:54,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:54,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 14:04:54,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:58,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:58,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:58,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 14:05:02,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 14:05:05,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:05,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:06,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:05:08,067 INFO [train.py:1039] (1/4) Epoch 11, batch 5100, loss[loss=0.1813, simple_loss=0.2668, pruned_loss=0.04787, over 24663.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2711, pruned_loss=0.06505, over 4702365.77 frames. ], batch size: 65, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:05:08,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 14:05:08,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=388140.0, ans=0.125 2023-09-29 14:05:11,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:05:14,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 14:05:14,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 14:05:15,744 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.127e+02 2.436e+02 3.285e+02, threshold=4.254e+02, percent-clipped=0.0 2023-09-29 14:05:15,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:17,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:05:17,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=388140.0, ans=0.1 2023-09-29 14:05:21,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:05:22,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 14:05:22,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 14:05:23,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=388206.6666666667, ans=0.125 2023-09-29 14:05:29,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:05:29,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:05:33,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:36,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 14:05:36,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:38,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:05:38,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 14:05:40,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=388273.3333333333, ans=0.125 2023-09-29 14:05:41,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 14:05:44,719 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 14:05:46,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:46,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 14:05:46,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 14:05:49,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:49,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=388273.3333333333, ans=0.125 2023-09-29 14:05:59,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:05:59,903 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:06:02,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 14:06:02,877 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 14:06:02,906 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 14:06:05,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 14:06:05,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:06:08,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 14:06:12,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 14:06:16,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 14:06:17,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:06:18,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=388406.6666666667, ans=0.125 2023-09-29 14:06:19,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 14:06:22,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:06:23,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 14:06:27,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:06:28,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:06:28,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:06:30,043 INFO [train.py:1039] (1/4) Epoch 11, batch 5150, loss[loss=0.2113, simple_loss=0.2785, pruned_loss=0.07203, over 24599.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2718, pruned_loss=0.0654, over 4702408.39 frames. ], batch size: 60, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:06:30,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:06:30,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:06:30,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:06:32,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 14:06:32,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 14:06:32,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 14:06:34,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:06:34,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 14:06:35,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:37,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:06:37,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=388473.3333333333, ans=0.0 2023-09-29 14:06:38,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:40,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:45,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:06:45,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 14:06:47,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:47,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:06:48,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:06:48,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:06:48,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:06:50,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:06:50,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:06:52,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 14:06:53,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:06:54,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:06:55,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:06:58,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 14:06:58,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:07:02,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=388606.6666666667, ans=0.0 2023-09-29 14:07:05,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:07:05,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 14:07:12,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:07:17,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:19,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:22,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=388673.3333333333, ans=0.125 2023-09-29 14:07:23,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:23,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:25,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 14:07:29,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:07:29,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=388673.3333333333, ans=0.0 2023-09-29 14:07:30,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:07:30,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:07:33,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:34,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:35,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 14:07:40,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:42,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:07:46,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:47,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:07:49,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:07:49,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:07:49,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:07:49,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:07:53,749 INFO [train.py:1039] (1/4) Epoch 11, batch 5200, loss[loss=0.1975, simple_loss=0.2762, pruned_loss=0.05942, over 24059.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2728, pruned_loss=0.06596, over 4705020.26 frames. ], batch size: 80, lr: 9.24e-03, grad_scale: 16.0 2023-09-29 14:07:53,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:07:55,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:07:57,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:02,221 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.017e+02 2.564e+02 3.234e+02 5.917e+02, threshold=5.129e+02, percent-clipped=10.0 2023-09-29 14:08:02,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 14:08:02,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:08:04,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:08,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:08,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:08:08,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:08,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=388873.3333333333, ans=0.125 2023-09-29 14:08:11,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 14:08:13,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:08:15,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:16,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 14:08:19,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:08:20,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:08:22,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 14:08:22,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 14:08:25,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 14:08:25,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:25,172 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 14:08:26,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:28,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:28,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:08:28,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 14:08:29,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:08:32,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:35,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 14:08:36,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 14:08:36,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 14:08:41,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 14:08:41,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:08:41,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=389006.6666666667, ans=0.125 2023-09-29 14:08:48,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:08:48,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=389006.6666666667, ans=0.0 2023-09-29 14:08:49,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:08:51,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 14:08:51,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:52,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:08:52,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:52,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:08:52,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff3.min_abs, batch_count=389006.6666666667, ans=0.2 2023-09-29 14:08:55,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=389006.6666666667, ans=0.125 2023-09-29 14:08:57,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:08:57,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:09:01,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:09:02,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:02,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:05,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=389073.3333333333, ans=0.125 2023-09-29 14:09:09,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:10,067 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.80 vs. limit=22.5 2023-09-29 14:09:10,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 14:09:12,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:09:12,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:09:14,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:14,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:09:15,398 INFO [train.py:1039] (1/4) Epoch 11, batch 5250, loss[loss=0.1619, simple_loss=0.2356, pruned_loss=0.04408, over 24610.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.271, pruned_loss=0.06509, over 4707292.61 frames. ], batch size: 60, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:09:17,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:09:17,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:09:20,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:20,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:09:22,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:09:27,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:29,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:09:32,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:09:35,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:09:37,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 14:09:37,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:39,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:40,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=15.0 2023-09-29 14:09:46,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=389273.3333333333, ans=0.125 2023-09-29 14:09:49,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=389273.3333333333, ans=0.0 2023-09-29 14:09:49,843 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.10 vs. limit=22.5 2023-09-29 14:09:50,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=389273.3333333333, ans=0.0 2023-09-29 14:09:50,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=389273.3333333333, ans=0.125 2023-09-29 14:10:13,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.34 vs. limit=15.0 2023-09-29 14:10:22,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=389406.6666666667, ans=0.0 2023-09-29 14:10:29,781 INFO [train.py:1039] (1/4) Epoch 11, batch 5300, loss[loss=0.1934, simple_loss=0.2694, pruned_loss=0.05876, over 24050.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2692, pruned_loss=0.06487, over 4695694.69 frames. ], batch size: 80, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:10:32,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=389473.3333333333, ans=0.0 2023-09-29 14:10:36,666 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.030e+02 2.219e+02 2.602e+02 3.750e+02, threshold=4.437e+02, percent-clipped=0.0 2023-09-29 14:10:42,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=389540.0, ans=0.125 2023-09-29 14:10:42,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=389540.0, ans=0.125 2023-09-29 14:10:44,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:10:45,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 14:10:45,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 14:10:45,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:45,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:45,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:45,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:45,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:45,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:10:45,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:45,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:10:46,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:10:46,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 14:10:46,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 14:10:46,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 14:10:47,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:10:47,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 14:10:47,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 14:10:47,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:48,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:48,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:48,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:48,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:10:48,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:48,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:49,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:49,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:49,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:49,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:10:49,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:49,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:10:50,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 14:10:50,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:51,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:51,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 14:10:51,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 14:10:51,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:10:51,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:10:51,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 14:10:51,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 14:10:51,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:10:52,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:10:52,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:52,869 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 14:10:53,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 14:10:53,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:10:53,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:53,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 14:10:53,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 14:10:53,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 14:10:53,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:11:03,746 INFO [train.py:1039] (1/4) Epoch 12, batch 0, loss[loss=0.2147, simple_loss=0.2803, pruned_loss=0.07457, over 24460.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2803, pruned_loss=0.07457, over 24460.00 frames. ], batch size: 58, lr: 8.84e-03, grad_scale: 32.0 2023-09-29 14:11:03,747 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 14:11:19,096 INFO [train.py:1071] (1/4) Epoch 12, validation: loss=0.305, simple_loss=0.2807, pruned_loss=0.1647, over 1125622.00 frames. 2023-09-29 14:11:19,097 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 14:11:23,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 14:11:25,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:11:26,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:11:30,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:30,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:11:30,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:31,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 14:11:33,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 14:11:34,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:36,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:38,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=389620.0, ans=0.0 2023-09-29 14:11:40,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:40,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:40,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:11:40,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:41,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 14:11:43,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:51,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:11:51,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:54,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 14:11:57,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:11:57,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:12:00,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:05,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:12:08,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:15,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 14:12:17,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.05 vs. limit=22.5 2023-09-29 14:12:18,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 14:12:18,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:18,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:19,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:12:19,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:12:21,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=389753.3333333333, ans=0.0 2023-09-29 14:12:22,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 14:12:24,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:26,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:27,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.19 vs. limit=10.0 2023-09-29 14:12:28,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=389820.0, ans=0.1 2023-09-29 14:12:31,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:12:33,300 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 14:12:36,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:12:39,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:40,548 INFO [train.py:1039] (1/4) Epoch 12, batch 50, loss[loss=0.2023, simple_loss=0.2679, pruned_loss=0.06839, over 23476.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2763, pruned_loss=0.06954, over 1060012.66 frames. ], batch size: 134, lr: 8.84e-03, grad_scale: 16.0 2023-09-29 14:12:40,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=389886.6666666667, ans=0.125 2023-09-29 14:12:42,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:42,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 14:12:42,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:12:42,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:12:44,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:46,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:47,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:49,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=389886.6666666667, ans=0.1 2023-09-29 14:12:52,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 14:12:52,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:52,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=389886.6666666667, ans=0.125 2023-09-29 14:12:57,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:12:58,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 14:13:01,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 14:13:01,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=389953.3333333333, ans=0.1 2023-09-29 14:13:02,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:13:04,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:04,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:05,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:07,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:13:07,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:13:07,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:13,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:15,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:16,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:13:17,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 14:13:20,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:13:20,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:13:20,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 14:13:22,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:23,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 14:13:28,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=390086.6666666667, ans=0.125 2023-09-29 14:13:33,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:13:33,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:35,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:36,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:36,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:40,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 14:13:40,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 14:13:42,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:42,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:43,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:43,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:45,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 14:13:45,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 14:13:48,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:13:49,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:49,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:13:50,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 14:13:50,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 14:13:51,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:53,162 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.076e+02 2.460e+02 3.514e+02 7.647e+02, threshold=4.919e+02, percent-clipped=15.0 2023-09-29 14:13:53,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:54,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:13:54,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:13:56,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=390153.3333333333, ans=0.125 2023-09-29 14:13:58,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:14:01,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:14:02,865 INFO [train.py:1039] (1/4) Epoch 12, batch 100, loss[loss=0.1661, simple_loss=0.2517, pruned_loss=0.04022, over 24453.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2758, pruned_loss=0.0672, over 1867910.36 frames. ], batch size: 69, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:14:03,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=390220.0, ans=0.125 2023-09-29 14:14:04,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:06,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 14:14:06,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:14:08,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=390220.0, ans=0.2 2023-09-29 14:14:10,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:14:11,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:11,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:14:11,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:14:11,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:15,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 14:14:16,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:14:17,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=390220.0, ans=15.0 2023-09-29 14:14:18,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:18,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:18,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:23,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 14:14:24,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:26,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:26,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:14:27,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=390286.6666666667, ans=10.0 2023-09-29 14:14:28,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:14:31,505 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 14:14:31,529 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 14:14:34,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:14:34,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:14:36,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:14:39,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:41,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:49,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:50,232 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.47 vs. limit=15.0 2023-09-29 14:14:51,018 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 14:14:53,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:14:53,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=390420.0, ans=0.125 2023-09-29 14:14:57,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:14:57,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:00,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:02,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:08,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:15:10,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:12,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:13,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:13,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:15:13,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:13,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 14:15:13,779 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 14:15:13,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:15,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:15:15,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:15,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:17,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 14:15:17,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:15:17,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:15:17,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:18,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:20,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:22,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:15:22,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:15:25,206 INFO [train.py:1039] (1/4) Epoch 12, batch 150, loss[loss=0.221, simple_loss=0.2888, pruned_loss=0.0766, over 23204.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2746, pruned_loss=0.06567, over 2515524.49 frames. ], batch size: 93, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:15:25,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:27,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.84 vs. limit=15.0 2023-09-29 14:15:30,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:30,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:15:30,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:34,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:35,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:36,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:15:38,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:43,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 14:15:43,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 14:15:43,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 14:15:46,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:15:46,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:15:48,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:49,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:49,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:49,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:49,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:51,175 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 14:15:54,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:00,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:05,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:16:06,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 14:16:09,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:16:09,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:09,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:12,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:16:14,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:16:15,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:16:18,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:18,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=390753.3333333333, ans=0.125 2023-09-29 14:16:19,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 14:16:24,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:25,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:25,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:16:25,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:16:26,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=390753.3333333333, ans=0.0 2023-09-29 14:16:29,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:30,076 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.31 vs. limit=15.0 2023-09-29 14:16:30,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 14:16:34,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:16:34,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:16:36,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:37,594 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.933e+02 2.162e+02 2.482e+02 3.211e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-29 14:16:39,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:16:40,490 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.48 vs. limit=6.0 2023-09-29 14:16:41,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 14:16:41,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:41,280 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 14:16:41,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=390820.0, ans=0.0 2023-09-29 14:16:44,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:47,323 INFO [train.py:1039] (1/4) Epoch 12, batch 200, loss[loss=0.1966, simple_loss=0.2633, pruned_loss=0.06496, over 20696.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2737, pruned_loss=0.06511, over 3007706.87 frames. ], batch size: 44, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:16:50,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:16:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:16:52,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 14:16:53,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:54,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:57,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 14:16:57,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:16:58,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:59,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=390886.6666666667, ans=0.0 2023-09-29 14:17:00,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:04,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:17:04,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:17:04,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:07,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=390953.3333333333, ans=0.0 2023-09-29 14:17:15,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=390953.3333333333, ans=0.0 2023-09-29 14:17:25,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:17:26,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:17:26,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:17:28,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:17:28,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:17:30,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:17:32,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:33,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:17:34,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:17:35,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:17:37,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 14:17:39,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:17:39,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:44,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:17:50,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:18:00,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:00,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:18:04,212 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.47 vs. limit=10.0 2023-09-29 14:18:07,874 INFO [train.py:1039] (1/4) Epoch 12, batch 250, loss[loss=0.2111, simple_loss=0.2845, pruned_loss=0.06882, over 23240.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2731, pruned_loss=0.0647, over 3394756.75 frames. ], batch size: 93, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:18:08,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:10,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 14:18:11,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:11,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:18:11,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:11,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:18:13,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 14:18:13,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:18:15,569 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 14:18:15,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:17,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:18:18,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:20,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:21,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:18:21,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:23,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:18:30,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:18:40,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:18:42,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:43,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:18:49,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:18:50,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:18:50,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:18:52,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:52,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:18:52,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:18:54,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:57,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:19:00,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 14:19:00,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:19:04,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:19:04,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:19:04,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:19:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:05,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:19:05,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:19:09,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:09,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:19:10,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:14,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:19:17,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:19,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:19:22,364 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.950e+02 2.182e+02 2.665e+02 5.527e+02, threshold=4.363e+02, percent-clipped=2.0 2023-09-29 14:19:25,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:26,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:19:30,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 14:19:31,307 INFO [train.py:1039] (1/4) Epoch 12, batch 300, loss[loss=0.171, simple_loss=0.2234, pruned_loss=0.05933, over 22744.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2705, pruned_loss=0.06458, over 3681877.12 frames. ], batch size: 322, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:19:31,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:19:33,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:34,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 14:19:34,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:19:36,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:19:36,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 14:19:36,942 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:19:41,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:43,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:19:46,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:19:48,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 14:19:49,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:50,647 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=12.0 2023-09-29 14:19:51,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:19:51,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 14:19:51,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:19:56,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:19:59,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:20:01,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 14:20:02,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 14:20:04,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:04,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:09,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:09,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 14:20:09,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:20:12,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:20:14,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:20:14,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:19,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:20:19,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 14:20:21,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:20:24,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:26,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 14:20:27,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:31,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=391753.3333333333, ans=0.5 2023-09-29 14:20:32,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:20:33,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.64 vs. limit=15.0 2023-09-29 14:20:34,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:20:34,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 14:20:38,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:38,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:20:40,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:42,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:20:43,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 14:20:44,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:20:44,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:45,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 14:20:47,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:47,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:49,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:50,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:50,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:54,291 INFO [train.py:1039] (1/4) Epoch 12, batch 350, loss[loss=0.2011, simple_loss=0.284, pruned_loss=0.05908, over 24570.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2694, pruned_loss=0.06428, over 3904408.56 frames. ], batch size: 71, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:20:55,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:20:55,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:20:59,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:01,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=391886.6666666667, ans=0.125 2023-09-29 14:21:07,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:21:10,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:10,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:12,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 14:21:13,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:15,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 14:21:17,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:17,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 14:21:18,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:22,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 14:21:23,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:21:26,953 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.37 vs. limit=10.0 2023-09-29 14:21:27,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:29,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:21:29,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:29,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:30,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:21:30,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:30,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:21:33,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:21:33,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:40,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:21:40,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:21:40,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:21:42,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:46,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 14:21:46,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:52,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:52,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:21:53,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:55,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 14:21:57,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:21:57,194 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 14:22:00,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 14:22:00,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:05,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:22:05,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 14:22:07,470 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.933e+02 2.139e+02 2.473e+02 3.749e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-29 14:22:07,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:09,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:22:10,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:10,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:10,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:12,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=392153.3333333333, ans=0.0 2023-09-29 14:22:13,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:15,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=392220.0, ans=0.0 2023-09-29 14:22:16,740 INFO [train.py:1039] (1/4) Epoch 12, batch 400, loss[loss=0.1996, simple_loss=0.2857, pruned_loss=0.05671, over 24439.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2691, pruned_loss=0.06451, over 4080886.53 frames. ], batch size: 69, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:22:16,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:22:17,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=392220.0, ans=0.5 2023-09-29 14:22:18,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:22:20,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 14:22:20,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:20,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:23,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:22:23,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:26,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:29,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:31,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 14:22:32,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 14:22:32,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:37,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 14:22:37,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:39,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=392286.6666666667, ans=0.1 2023-09-29 14:22:40,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:22:40,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:40,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 14:22:41,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:22:41,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:41,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:43,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:45,011 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 14:22:46,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 14:22:51,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:52,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:54,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 14:22:55,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 14:22:57,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:23:00,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:08,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 14:23:12,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:23:14,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 14:23:15,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:23:18,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:23:18,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 14:23:21,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:23:24,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:23:26,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:23:27,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:27,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 14:23:28,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=392486.6666666667, ans=0.0 2023-09-29 14:23:29,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:23:30,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 14:23:34,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:23:34,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:23:36,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 14:23:37,568 INFO [train.py:1039] (1/4) Epoch 12, batch 450, loss[loss=0.2593, simple_loss=0.3014, pruned_loss=0.1086, over 19581.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2697, pruned_loss=0.06463, over 4211720.65 frames. ], batch size: 388, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:23:39,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:23:39,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:23:39,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:23:42,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 14:23:42,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:23:44,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:23:46,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:23:46,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 14:23:47,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:23:48,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:23:51,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:23:54,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=392620.0, ans=0.2 2023-09-29 14:24:00,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:00,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:02,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 14:24:03,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 14:24:08,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:24:11,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:13,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:16,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:16,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:17,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=392686.6666666667, ans=0.1 2023-09-29 14:24:20,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 14:24:22,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 14:24:23,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 14:24:23,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:24:25,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:25,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:24:27,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 14:24:27,121 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 14:24:28,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:30,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:24:31,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:24:34,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:24:34,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:24:36,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:24:36,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 14:24:39,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:40,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:24:40,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:24:41,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.51 vs. limit=15.0 2023-09-29 14:24:42,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 14:24:48,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:24:48,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 14:24:49,660 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.81 vs. limit=22.5 2023-09-29 14:24:50,218 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.154e+02 2.453e+02 3.354e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 14:24:50,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 14:24:52,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:58,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:24:59,745 INFO [train.py:1039] (1/4) Epoch 12, batch 500, loss[loss=0.1929, simple_loss=0.2676, pruned_loss=0.05913, over 24301.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.27, pruned_loss=0.06381, over 4332007.68 frames. ], batch size: 61, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:24:59,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:01,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:25:02,777 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 14:25:07,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:07,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:25:08,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:08,843 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 14:25:10,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 14:25:10,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:12,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=392886.6666666667, ans=0.05 2023-09-29 14:25:13,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:25:18,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:25:20,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:25:20,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:20,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:20,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=392953.3333333333, ans=0.125 2023-09-29 14:25:21,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:21,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=392953.3333333333, ans=0.125 2023-09-29 14:25:33,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:33,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:25:34,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:25:34,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 14:25:34,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:25:37,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:25:38,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:25:39,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:25:39,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:39,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 14:25:43,891 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 14:25:45,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:25:47,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:49,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:25:50,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 14:25:55,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:25:55,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=22.5 2023-09-29 14:25:56,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:01,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:04,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:26:06,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=393153.3333333333, ans=0.125 2023-09-29 14:26:09,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:12,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 14:26:12,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:12,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:15,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 14:26:17,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:26:18,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:20,306 INFO [train.py:1039] (1/4) Epoch 12, batch 550, loss[loss=0.2879, simple_loss=0.3275, pruned_loss=0.1241, over 19533.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2711, pruned_loss=0.06439, over 4412374.89 frames. ], batch size: 389, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:26:21,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.30 vs. limit=15.0 2023-09-29 14:26:22,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 14:26:25,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 14:26:25,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 14:26:27,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:26:27,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:28,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:26:28,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:26:32,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:32,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 14:26:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:26:39,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:26:39,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:42,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:26:44,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:48,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.32 vs. limit=15.0 2023-09-29 14:26:49,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 14:26:50,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 14:26:52,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:26:56,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:26:56,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:26:59,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:27:01,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=393353.3333333333, ans=0.125 2023-09-29 14:27:03,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:03,541 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 14:27:03,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:27:05,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:27:05,462 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:27:07,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:27:08,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:27:08,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:27:09,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=393420.0, ans=0.125 2023-09-29 14:27:10,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:10,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=393420.0, ans=0.125 2023-09-29 14:27:11,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 14:27:12,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 14:27:13,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:13,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:27:13,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:27:13,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:27:17,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:27:19,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:27:21,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:27:22,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:22,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 14:27:24,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:27:25,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:27,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:27:29,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:29,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:27:31,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:27:34,359 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.077e+02 2.338e+02 2.752e+02 4.124e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 14:27:37,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 14:27:41,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 14:27:43,123 INFO [train.py:1039] (1/4) Epoch 12, batch 600, loss[loss=0.1902, simple_loss=0.2785, pruned_loss=0.05094, over 24351.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2717, pruned_loss=0.06504, over 4469806.42 frames. ], batch size: 74, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:27:43,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:27:44,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:27:44,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:51,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:27:54,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:27:55,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 14:27:56,416 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.82 vs. limit=6.0 2023-09-29 14:27:58,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:28:00,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:01,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:05,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 14:28:05,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:28:07,598 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:28:10,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 14:28:10,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=393620.0, ans=0.125 2023-09-29 14:28:14,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:28:14,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:14,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:28:14,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=393686.6666666667, ans=0.04949747468305833 2023-09-29 14:28:16,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=393686.6666666667, ans=0.125 2023-09-29 14:28:20,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:28:20,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:28:22,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:28,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:28:32,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:32,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:33,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:41,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 14:28:46,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:28:46,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:28:52,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 14:28:53,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:28:56,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 14:28:56,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:28:58,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:29:03,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:29:04,520 INFO [train.py:1039] (1/4) Epoch 12, batch 650, loss[loss=0.2123, simple_loss=0.2952, pruned_loss=0.06467, over 24432.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.2719, pruned_loss=0.06487, over 4530071.62 frames. ], batch size: 69, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:29:04,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:29:07,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:10,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:29:12,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:15,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 14:29:16,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:29:23,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:29:23,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:26,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:27,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=393953.3333333333, ans=0.125 2023-09-29 14:29:29,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 14:29:31,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:29:32,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:33,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=15.0 2023-09-29 14:29:35,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:29:36,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:29:39,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:39,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:40,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:29:41,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:42,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:29:44,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:29:45,609 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 14:29:45,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:45,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:29:49,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:49,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:50,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:29:51,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:29:51,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 14:29:53,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:29:53,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:55,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:29:57,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:58,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:30:00,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 14:30:02,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 14:30:02,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:02,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:30:03,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:30:03,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:30:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:30:06,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=394086.6666666667, ans=0.125 2023-09-29 14:30:12,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:12,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:14,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:30:17,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:17,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:30:17,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:19,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=394153.3333333333, ans=0.0 2023-09-29 14:30:20,380 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.997e+02 2.199e+02 2.485e+02 3.515e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 14:30:26,990 INFO [train.py:1039] (1/4) Epoch 12, batch 700, loss[loss=0.2018, simple_loss=0.2751, pruned_loss=0.0642, over 23422.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2693, pruned_loss=0.06464, over 4553461.97 frames. ], batch size: 93, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:30:27,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:30:27,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:27,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:27,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:30,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=394220.0, ans=0.125 2023-09-29 14:30:32,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 14:30:34,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 14:30:36,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 14:30:37,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:39,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:30:42,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 14:30:44,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=394286.6666666667, ans=0.2 2023-09-29 14:30:46,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:49,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:30:51,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:53,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:30:53,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:55,513 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=12.0 2023-09-29 14:30:57,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:58,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=394353.3333333333, ans=0.025 2023-09-29 14:30:59,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:30:59,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:31:03,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 14:31:05,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 14:31:08,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:31:10,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:31:12,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:31:16,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:31:18,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 14:31:21,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:21,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:31:21,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 14:31:26,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:31:26,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:29,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:31:36,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:31:36,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 14:31:40,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 14:31:41,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 14:31:44,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:31:46,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:31:48,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:48,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 14:31:50,005 INFO [train.py:1039] (1/4) Epoch 12, batch 750, loss[loss=0.1676, simple_loss=0.2469, pruned_loss=0.04414, over 24367.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2689, pruned_loss=0.06416, over 4599571.19 frames. ], batch size: 61, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:31:50,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=394553.3333333333, ans=0.1 2023-09-29 14:31:53,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 14:31:53,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 14:31:53,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 14:31:54,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 14:31:56,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 14:31:56,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:31:56,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 14:31:57,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:59,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:00,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:02,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:04,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:32:05,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:06,228 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:32:08,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:32:10,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:32:11,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:32:13,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:13,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:15,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 14:32:16,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:32:18,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:19,472 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.05 vs. limit=15.0 2023-09-29 14:32:20,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:23,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:32:25,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 14:32:25,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:32:25,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 14:32:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 14:32:27,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 14:32:27,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:32:27,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:32:27,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=394686.6666666667, ans=0.125 2023-09-29 14:32:30,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:32:37,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:37,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:37,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:32:40,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:41,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:41,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 14:32:43,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:32:45,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:32:46,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-09-29 14:32:47,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:32:47,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=394753.3333333333, ans=0.125 2023-09-29 14:32:49,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:32:50,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 14:32:50,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:55,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:32:57,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:32:57,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:59,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=394820.0, ans=0.0 2023-09-29 14:33:00,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:33:02,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 14:33:02,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:03,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:06,838 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.038e+02 2.318e+02 2.858e+02 4.234e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 14:33:10,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:10,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:33:12,254 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=394886.6666666667, ans=0.0 2023-09-29 14:33:13,493 INFO [train.py:1039] (1/4) Epoch 12, batch 800, loss[loss=0.1982, simple_loss=0.278, pruned_loss=0.05917, over 24407.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2701, pruned_loss=0.06478, over 4621251.09 frames. ], batch size: 77, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:33:19,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=394886.6666666667, ans=0.2 2023-09-29 14:33:22,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:22,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:22,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=394886.6666666667, ans=0.0 2023-09-29 14:33:23,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:25,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:27,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:27,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:27,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=394886.6666666667, ans=0.125 2023-09-29 14:33:32,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:32,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:33:34,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 14:33:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:36,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:37,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:33:37,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:38,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 14:33:38,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:38,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 14:33:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:46,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:47,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:48,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:52,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:52,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:57,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:33:57,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:33:58,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 14:34:00,390 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 14:34:00,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 14:34:00,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:34:00,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:03,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:03,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:07,279 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 14:34:07,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=395086.6666666667, ans=0.125 2023-09-29 14:34:07,456 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=395086.6666666667, ans=0.0 2023-09-29 14:34:08,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 14:34:08,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:34:11,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:34:15,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:34:18,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:34:21,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 14:34:22,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:34:25,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 14:34:28,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=395153.3333333333, ans=15.0 2023-09-29 14:34:31,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=395153.3333333333, ans=0.0 2023-09-29 14:34:34,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:36,934 INFO [train.py:1039] (1/4) Epoch 12, batch 850, loss[loss=0.2004, simple_loss=0.2765, pruned_loss=0.06219, over 24669.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2703, pruned_loss=0.06462, over 4649503.99 frames. ], batch size: 65, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:34:37,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:34:37,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 14:34:37,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:34:37,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:40,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 14:34:40,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:40,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:34:42,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:45,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:34:47,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:48,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 14:34:48,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 14:34:48,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 14:34:50,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:50,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:34:53,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:53,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:53,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:35:00,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:00,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:00,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 14:35:06,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 14:35:09,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:10,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 14:35:11,403 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.28 vs. limit=15.0 2023-09-29 14:35:15,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 14:35:15,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=395353.3333333333, ans=0.125 2023-09-29 14:35:16,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 14:35:18,903 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 14:35:18,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:18,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:35:19,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.78 vs. limit=15.0 2023-09-29 14:35:20,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:35:21,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:23,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:24,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 14:35:26,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:26,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:28,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:35:28,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:35:30,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:35:31,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=395420.0, ans=0.1 2023-09-29 14:35:32,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:35:32,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 14:35:34,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.15 vs. limit=15.0 2023-09-29 14:35:35,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.15 vs. limit=15.0 2023-09-29 14:35:37,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:35:37,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:37,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:35:37,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:40,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:43,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:44,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:35:46,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:35:48,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:35:48,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:35:53,430 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.869e+02 2.098e+02 2.387e+02 5.753e+02, threshold=4.196e+02, percent-clipped=1.0 2023-09-29 14:35:53,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:35:56,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:56,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 14:35:57,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:35:57,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:59,366 INFO [train.py:1039] (1/4) Epoch 12, batch 900, loss[loss=0.1943, simple_loss=0.2766, pruned_loss=0.05596, over 24624.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.271, pruned_loss=0.06526, over 4665592.51 frames. ], batch size: 68, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:36:00,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 14:36:07,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:36:09,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=395553.3333333333, ans=0.0 2023-09-29 14:36:10,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:11,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 14:36:13,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:36:14,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 14:36:15,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=395620.0, ans=0.125 2023-09-29 14:36:16,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:36:16,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:36:16,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:16,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:36:16,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=395620.0, ans=0.0 2023-09-29 14:36:17,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:36:21,860 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.23 vs. limit=15.0 2023-09-29 14:36:27,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:36:27,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:27,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:36:29,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=395620.0, ans=0.0 2023-09-29 14:36:31,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:34,342 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=395686.6666666667, ans=0.1 2023-09-29 14:36:36,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 14:36:40,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:36:44,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:36:46,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:36:46,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=395686.6666666667, ans=10.0 2023-09-29 14:36:47,929 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 14:36:49,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 14:36:54,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:36:55,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:36:55,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:36:55,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=395753.3333333333, ans=0.1 2023-09-29 14:37:00,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:00,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:03,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 14:37:04,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:37:07,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 14:37:07,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=14.84 vs. limit=15.0 2023-09-29 14:37:10,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:37:10,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:10,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:37:11,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:15,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 14:37:15,772 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 14:37:17,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:37:19,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 14:37:20,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:22,881 INFO [train.py:1039] (1/4) Epoch 12, batch 950, loss[loss=0.1941, simple_loss=0.2762, pruned_loss=0.056, over 24550.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2712, pruned_loss=0.06546, over 4666839.24 frames. ], batch size: 71, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:37:24,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 14:37:29,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:31,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=395886.6666666667, ans=0.1 2023-09-29 14:37:32,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:37:34,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=395886.6666666667, ans=0.2 2023-09-29 14:37:35,510 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 14:37:39,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:39,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:39,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:40,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:37:40,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 14:37:41,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.71 vs. limit=12.0 2023-09-29 14:37:42,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:37:46,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:46,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.82 vs. limit=15.0 2023-09-29 14:37:48,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 14:37:49,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:52,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:53,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:53,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:54,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 14:37:56,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:37:58,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:59,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:38:04,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:04,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:38:09,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 14:38:12,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:38:12,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:38:12,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:12,607 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.79 vs. limit=22.5 2023-09-29 14:38:13,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:13,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:38:19,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 14:38:19,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:38:19,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=396086.6666666667, ans=0.125 2023-09-29 14:38:22,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:22,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:24,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 14:38:24,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:24,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:38:24,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=396086.6666666667, ans=0.0 2023-09-29 14:38:26,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 14:38:26,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=396086.6666666667, ans=10.0 2023-09-29 14:38:32,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:38:34,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:38,838 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.985e+02 2.175e+02 2.387e+02 3.582e+02, threshold=4.351e+02, percent-clipped=0.0 2023-09-29 14:38:39,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:38:42,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 14:38:42,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 14:38:44,283 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=12.0 2023-09-29 14:38:44,933 INFO [train.py:1039] (1/4) Epoch 12, batch 1000, loss[loss=0.1815, simple_loss=0.229, pruned_loss=0.067, over 19531.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2702, pruned_loss=0.06516, over 4670302.00 frames. ], batch size: 388, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:38:46,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 14:38:48,916 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.75 vs. limit=15.0 2023-09-29 14:38:49,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:55,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:38:57,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 14:38:57,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 14:39:04,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:04,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:39:04,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:09,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 14:39:09,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=396286.6666666667, ans=0.0 2023-09-29 14:39:10,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 14:39:12,521 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=396286.6666666667, ans=0.125 2023-09-29 14:39:13,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 14:39:15,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:15,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 14:39:17,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 14:39:17,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 14:39:17,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:18,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:28,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:29,864 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.08 vs. limit=15.0 2023-09-29 14:39:30,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:39:30,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=396353.3333333333, ans=0.125 2023-09-29 14:39:31,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:32,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:32,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 14:39:32,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:34,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:39:34,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:35,674 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 14:39:36,544 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=8.50 vs. limit=12.0 2023-09-29 14:39:36,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=15.57 vs. limit=15.0 2023-09-29 14:39:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 14:39:40,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 14:39:42,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 14:39:44,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:39:48,269 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-09-29 14:39:52,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:39:52,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:39:52,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=396486.6666666667, ans=0.125 2023-09-29 14:39:53,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=396486.6666666667, ans=0.125 2023-09-29 14:39:54,576 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.48 vs. limit=15.0 2023-09-29 14:39:55,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 14:39:56,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:39:56,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 14:39:58,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 14:40:00,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:00,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:40:02,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:40:04,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:40:06,757 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.67 vs. limit=12.0 2023-09-29 14:40:07,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:08,956 INFO [train.py:1039] (1/4) Epoch 12, batch 1050, loss[loss=0.2067, simple_loss=0.2907, pruned_loss=0.06141, over 24383.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2695, pruned_loss=0.06465, over 4684979.73 frames. ], batch size: 77, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:40:09,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=396553.3333333333, ans=0.125 2023-09-29 14:40:10,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:40:12,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:40:13,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:40:15,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:18,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:21,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:40:23,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:40:25,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:40:26,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:40:26,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:40:27,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.38 vs. limit=15.0 2023-09-29 14:40:28,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:40:29,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 14:40:29,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:30,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 14:40:34,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:34,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 14:40:34,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:40:43,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:44,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:40:44,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:46,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 14:40:47,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 14:40:47,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:51,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 14:40:55,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 14:40:55,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:58,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:41:00,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:41:00,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:00,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:41:05,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:41:08,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 14:41:09,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 14:41:10,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 14:41:10,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:10,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:41:14,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 14:41:17,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:41:20,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:20,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:41:22,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:24,159 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.921e+02 2.123e+02 2.375e+02 5.047e+02, threshold=4.247e+02, percent-clipped=1.0 2023-09-29 14:41:26,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:26,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 14:41:28,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:29,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 14:41:29,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 14:41:29,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:41:30,455 INFO [train.py:1039] (1/4) Epoch 12, batch 1100, loss[loss=0.1927, simple_loss=0.2775, pruned_loss=0.05401, over 24557.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2691, pruned_loss=0.06425, over 4692399.28 frames. ], batch size: 71, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:41:33,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:41:35,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=396886.6666666667, ans=0.0 2023-09-29 14:41:38,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:41:43,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:41:45,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:41:46,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:46,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 14:41:48,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:50,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:41:54,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:41:57,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:41:57,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 14:41:59,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:41:59,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:59,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:42:02,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:42:04,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:42:07,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=397020.0, ans=0.125 2023-09-29 14:42:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:42:11,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 14:42:11,969 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 14:42:13,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:16,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:17,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:42:17,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:42:20,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 14:42:20,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=397086.6666666667, ans=0.0 2023-09-29 14:42:21,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:42:21,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:42:21,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:42:23,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:23,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 14:42:27,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=397086.6666666667, ans=0.1 2023-09-29 14:42:30,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:42:31,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 14:42:32,544 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.76 vs. limit=22.5 2023-09-29 14:42:33,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:42:37,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:42:41,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 14:42:41,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:42:41,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:43,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=397153.3333333333, ans=0.125 2023-09-29 14:42:44,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:42:44,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:44,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 14:42:44,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:42:44,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:46,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 14:42:46,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:42:47,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 14:42:48,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:42:48,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:42:49,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:42:52,847 INFO [train.py:1039] (1/4) Epoch 12, batch 1150, loss[loss=0.2247, simple_loss=0.2794, pruned_loss=0.08496, over 19288.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2698, pruned_loss=0.06509, over 4684581.53 frames. ], batch size: 388, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:42:55,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=397220.0, ans=0.125 2023-09-29 14:42:57,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:01,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:43:02,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.05 vs. limit=15.0 2023-09-29 14:43:03,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:03,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:43:03,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 14:43:03,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:06,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 14:43:08,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:08,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:43:14,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 14:43:15,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:20,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:21,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:21,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 14:43:21,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:43:21,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:23,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=397353.3333333333, ans=0.1 2023-09-29 14:43:23,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=397353.3333333333, ans=0.0 2023-09-29 14:43:25,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 14:43:27,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:29,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:31,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=397353.3333333333, ans=0.125 2023-09-29 14:43:33,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=397353.3333333333, ans=0.125 2023-09-29 14:43:38,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=397353.3333333333, ans=0.125 2023-09-29 14:43:39,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:45,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:45,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 14:43:47,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:47,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:47,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=397420.0, ans=0.125 2023-09-29 14:43:54,867 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 14:43:56,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:44:00,731 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.86 vs. limit=15.0 2023-09-29 14:44:03,149 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 14:44:07,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=397486.6666666667, ans=0.0 2023-09-29 14:44:08,541 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.949e+02 2.168e+02 2.522e+02 3.297e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 14:44:08,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:08,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:44:08,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:44:10,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:44:13,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:15,308 INFO [train.py:1039] (1/4) Epoch 12, batch 1200, loss[loss=0.1674, simple_loss=0.2372, pruned_loss=0.04877, over 24314.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2703, pruned_loss=0.06483, over 4694800.22 frames. ], batch size: 56, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:44:15,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=397553.3333333333, ans=0.1 2023-09-29 14:44:19,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:44:20,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:44:20,464 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:44:21,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:21,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:23,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:44:24,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:44:26,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:44:27,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=397553.3333333333, ans=0.125 2023-09-29 14:44:29,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:29,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:44:30,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=397620.0, ans=0.0 2023-09-29 14:44:32,187 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 14:44:34,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=397620.0, ans=0.025 2023-09-29 14:44:35,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 14:44:39,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:44:39,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.64 vs. limit=12.0 2023-09-29 14:44:42,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:44:43,693 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-09-29 14:44:44,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:46,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:44:46,552 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 14:44:48,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:54,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:44:54,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:44:55,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 14:44:55,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:45:00,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 14:45:03,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 14:45:05,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:45:05,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:45:06,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:06,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:45:09,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:45:09,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:45:10,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:45:11,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 14:45:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:45:11,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:11,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:45:14,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:14,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:21,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:45:22,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:45:25,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 14:45:30,626 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 14:45:30,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:45:33,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:35,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:45:36,953 INFO [train.py:1039] (1/4) Epoch 12, batch 1250, loss[loss=0.194, simple_loss=0.2604, pruned_loss=0.06381, over 23321.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.2709, pruned_loss=0.06485, over 4694078.09 frames. ], batch size: 93, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:45:37,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:40,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 14:45:46,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:45:48,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:48,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 14:45:51,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:45:51,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:45:57,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:45:58,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:58,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:45:58,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:46:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:46:05,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:46:05,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:05,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:07,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:46:07,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:11,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:11,248 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=398020.0, ans=0.125 2023-09-29 14:46:12,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:46:17,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 14:46:19,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:46:22,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:22,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 14:46:22,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:46:22,571 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 14:46:23,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:23,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:29,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:46:34,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 14:46:34,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 14:46:34,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 14:46:39,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:46:39,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 14:46:39,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:42,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:46:42,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:46:44,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 14:46:45,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:46:45,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:46:46,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:46:46,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:48,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 14:46:50,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:52,056 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.018e+02 2.304e+02 2.594e+02 4.435e+02, threshold=4.607e+02, percent-clipped=1.0 2023-09-29 14:46:52,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:46:52,953 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.92 vs. limit=15.0 2023-09-29 14:46:53,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:46:57,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:58,619 INFO [train.py:1039] (1/4) Epoch 12, batch 1300, loss[loss=0.196, simple_loss=0.2559, pruned_loss=0.06807, over 23578.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2714, pruned_loss=0.06526, over 4693674.60 frames. ], batch size: 256, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:47:02,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:47:02,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 14:47:05,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:07,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:47:08,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:10,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:47:11,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:47:13,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 14:47:19,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:47:20,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:47:22,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 14:47:27,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:47:30,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:30,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:33,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:36,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:38,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:47:38,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:47:38,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 14:47:43,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:47:43,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:47:44,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 14:47:46,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:47:47,109 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.93 vs. limit=15.0 2023-09-29 14:47:47,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:47:48,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.51 vs. limit=15.0 2023-09-29 14:47:50,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:50,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 14:47:52,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:52,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 14:47:53,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:59,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:59,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:48:02,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 14:48:03,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 14:48:05,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 14:48:12,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:48:14,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 14:48:15,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:20,330 INFO [train.py:1039] (1/4) Epoch 12, batch 1350, loss[loss=0.1917, simple_loss=0.2392, pruned_loss=0.0721, over 19411.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.2701, pruned_loss=0.06534, over 4686010.78 frames. ], batch size: 388, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:48:20,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 14:48:23,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:26,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:30,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:30,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:32,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:48:33,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:36,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:38,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 14:48:39,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:48:42,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:48:45,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 14:48:45,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:48:47,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:48:47,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 14:48:48,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 14:48:49,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 14:48:52,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:52,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 14:49:05,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:16,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 14:49:21,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:21,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 14:49:21,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:49:23,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:49:25,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:49:27,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 14:49:28,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:49:33,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 14:49:35,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 14:49:36,890 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.916e+02 2.100e+02 2.366e+02 3.252e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-29 14:49:42,870 INFO [train.py:1039] (1/4) Epoch 12, batch 1400, loss[loss=0.1824, simple_loss=0.2665, pruned_loss=0.04915, over 24662.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2678, pruned_loss=0.06461, over 4681461.48 frames. ], batch size: 68, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:49:43,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 14:49:43,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=398886.6666666667, ans=0.1 2023-09-29 14:49:44,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:48,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:49:49,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:49:55,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 14:49:57,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 14:49:59,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.21 vs. limit=10.0 2023-09-29 14:50:03,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=398953.3333333333, ans=0.125 2023-09-29 14:50:07,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:50:08,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:11,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:50:11,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:50:16,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:50:16,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=399020.0, ans=0.0 2023-09-29 14:50:17,198 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.50 vs. limit=15.0 2023-09-29 14:50:18,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:50:27,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:29,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:34,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 14:50:34,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:50:34,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:50:36,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:50:37,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:39,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:50:39,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:50:39,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:50:41,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 14:50:41,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:50:45,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.58 vs. limit=15.0 2023-09-29 14:50:47,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:50,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:50:57,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 14:50:58,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:51:00,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:51:02,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:51:02,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:04,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=399220.0, ans=0.0 2023-09-29 14:51:05,598 INFO [train.py:1039] (1/4) Epoch 12, batch 1450, loss[loss=0.211, simple_loss=0.2886, pruned_loss=0.06669, over 24375.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2668, pruned_loss=0.06394, over 4693168.78 frames. ], batch size: 77, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:51:05,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:51:09,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:51:12,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:51:12,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:51:17,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:18,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:51:20,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:51:20,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 14:51:20,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=399286.6666666667, ans=0.0 2023-09-29 14:51:22,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:51:22,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 14:51:23,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:23,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:23,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 14:51:27,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:28,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:51:29,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 14:51:29,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:29,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:51:30,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:33,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:39,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:51:39,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:51:42,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:43,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:44,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:45,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:51:45,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:46,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:51:46,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=399353.3333333333, ans=0.0 2023-09-29 14:51:49,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 14:51:53,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:56,355 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 14:51:57,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:51:59,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:52:01,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:02,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 14:52:03,913 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.37 vs. limit=15.0 2023-09-29 14:52:07,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:09,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 14:52:10,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 14:52:12,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:14,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:14,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:52:17,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 14:52:19,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 14:52:19,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 14:52:21,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:22,739 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.049e+02 2.244e+02 2.638e+02 4.746e+02, threshold=4.488e+02, percent-clipped=1.0 2023-09-29 14:52:24,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:52:29,494 INFO [train.py:1039] (1/4) Epoch 12, batch 1500, loss[loss=0.22, simple_loss=0.2901, pruned_loss=0.07489, over 23395.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2673, pruned_loss=0.06377, over 4703073.90 frames. ], batch size: 93, lr: 8.73e-03, grad_scale: 32.0 2023-09-29 14:52:37,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 14:52:37,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:52:37,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:52:39,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:40,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:40,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:52:42,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 14:52:42,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:52:43,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:52:43,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:43,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:45,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:52:47,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 14:52:54,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:52:56,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:52:57,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:58,209 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.23 vs. limit=6.0 2023-09-29 14:53:02,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 14:53:05,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 14:53:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:53:07,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 14:53:10,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:53:13,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:13,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:13,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:53:15,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 14:53:16,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:53:16,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:16,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 14:53:18,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:21,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.whiten.whitening_limit, batch_count=399753.3333333333, ans=12.0 2023-09-29 14:53:22,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=22.5 2023-09-29 14:53:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:53:23,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 14:53:28,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:53:30,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:53:32,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=399753.3333333333, ans=0.5 2023-09-29 14:53:32,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=399753.3333333333, ans=0.125 2023-09-29 14:53:35,644 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 14:53:35,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:37,059 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 14:53:38,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:53:38,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:53:40,158 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 14:53:41,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:53:44,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 14:53:45,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:50,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:51,400 INFO [train.py:1039] (1/4) Epoch 12, batch 1550, loss[loss=0.1931, simple_loss=0.2831, pruned_loss=0.05152, over 24623.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2685, pruned_loss=0.06376, over 4704349.04 frames. ], batch size: 68, lr: 8.73e-03, grad_scale: 16.0 2023-09-29 14:53:51,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:51,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:53,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 14:53:55,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 14:53:55,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:53:56,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 14:53:57,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 14:54:00,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:00,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:00,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=399886.6666666667, ans=0.0 2023-09-29 14:54:01,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:01,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:54:03,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:03,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:07,895 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 14:54:07,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:08,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:54:08,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=399953.3333333333, ans=0.1 2023-09-29 14:54:10,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:54:12,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:54:12,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 14:54:13,178 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.14 vs. limit=15.0 2023-09-29 14:54:15,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:15,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 14:54:16,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 14:54:16,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 14:54:16,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:16,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:18,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=399953.3333333333, ans=0.125 2023-09-29 14:54:23,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=399953.3333333333, ans=0.0 2023-09-29 14:54:25,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=399953.3333333333, ans=0.125 2023-09-29 14:54:26,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:54:26,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=400020.0, ans=0.125 2023-09-29 14:54:28,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 14:54:28,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 14:54:30,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.57 vs. limit=12.0 2023-09-29 14:54:33,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=400020.0, ans=0.1 2023-09-29 14:54:37,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:41,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:42,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:54:42,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:54:42,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 14:54:49,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:54:50,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:53,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:54:55,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:54:55,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:55,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 14:54:57,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:54:59,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:54:59,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:59,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 14:54:59,233 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 14:54:59,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=400153.3333333333, ans=0.0 2023-09-29 14:55:01,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=15.0 2023-09-29 14:55:02,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:09,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 14:55:11,981 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.422e+02 2.047e+02 2.446e+02 2.941e+02 5.003e+02, threshold=4.892e+02, percent-clipped=3.0 2023-09-29 14:55:12,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:13,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:55:13,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 14:55:16,841 INFO [train.py:1039] (1/4) Epoch 12, batch 1600, loss[loss=0.2209, simple_loss=0.2861, pruned_loss=0.07791, over 23365.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2696, pruned_loss=0.06404, over 4714964.35 frames. ], batch size: 285, lr: 8.72e-03, grad_scale: 32.0 2023-09-29 14:55:16,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:55:18,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:18,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:55:18,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:55:19,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:55:24,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:24,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 14:55:26,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 14:55:29,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 14:55:31,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:55:33,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 14:55:34,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:55:35,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:55:41,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:55:44,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 14:55:46,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.71 vs. limit=12.0 2023-09-29 14:55:48,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:55:49,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 14:55:50,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:50,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 14:55:51,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=400353.3333333333, ans=0.125 2023-09-29 14:55:53,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=400353.3333333333, ans=0.125 2023-09-29 14:55:57,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 14:56:00,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=400353.3333333333, ans=0.0 2023-09-29 14:56:04,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:04,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 14:56:06,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:06,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:06,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:56:09,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 14:56:13,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 14:56:13,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:56:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:16,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:56:17,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:56:18,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:56:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:56:27,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:27,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:56:30,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 14:56:30,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:56:31,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 14:56:36,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:39,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:56:39,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:56:41,161 INFO [train.py:1039] (1/4) Epoch 12, batch 1650, loss[loss=0.2083, simple_loss=0.2935, pruned_loss=0.06153, over 24314.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.271, pruned_loss=0.06523, over 4704901.97 frames. ], batch size: 74, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:56:41,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 14:56:41,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 14:56:41,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 14:56:41,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 14:56:46,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:48,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:48,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:56:48,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:56:48,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=400553.3333333333, ans=0.2 2023-09-29 14:56:50,395 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.06 vs. limit=15.0 2023-09-29 14:56:51,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:55,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 14:56:57,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:57,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:57,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:56:57,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:56:58,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 14:56:58,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 14:57:06,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:57:07,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:57:17,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 14:57:19,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:19,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.33 vs. limit=22.5 2023-09-29 14:57:22,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 14:57:23,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=400686.6666666667, ans=0.125 2023-09-29 14:57:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:28,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:57:29,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:57:29,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:31,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:57:31,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:34,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:57:36,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:36,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:36,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:37,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:38,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=400753.3333333333, ans=0.125 2023-09-29 14:57:39,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:57:40,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=15.0 2023-09-29 14:57:41,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=400753.3333333333, ans=0.2 2023-09-29 14:57:42,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:44,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 14:57:45,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:47,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 14:57:47,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 14:57:47,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 14:57:47,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:48,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=400820.0, ans=0.02 2023-09-29 14:57:48,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.45 vs. limit=22.5 2023-09-29 14:57:49,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:57:50,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:51,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:51,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 14:57:53,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.57 vs. limit=12.0 2023-09-29 14:57:56,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:58,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:57:58,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:59,739 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.879e+02 2.089e+02 2.451e+02 3.406e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 14:58:00,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 14:58:03,015 INFO [train.py:1039] (1/4) Epoch 12, batch 1700, loss[loss=0.1748, simple_loss=0.244, pruned_loss=0.05276, over 24283.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2698, pruned_loss=0.06429, over 4718264.40 frames. ], batch size: 56, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:58:04,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:58:04,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:58:06,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 14:58:06,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:06,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:58:06,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:09,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:58:09,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=400886.6666666667, ans=0.0 2023-09-29 14:58:09,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=400886.6666666667, ans=0.125 2023-09-29 14:58:10,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:58:10,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 14:58:13,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:58:18,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:18,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=400953.3333333333, ans=0.1 2023-09-29 14:58:21,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:58:29,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:58:29,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:58:29,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:29,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:58:33,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 14:58:35,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:58:35,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:37,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:58:37,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=401020.0, ans=0.015 2023-09-29 14:58:37,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=401020.0, ans=0.125 2023-09-29 14:58:38,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:58:42,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 14:58:42,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 14:58:43,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:46,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 14:58:46,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:58:52,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=401086.6666666667, ans=0.2 2023-09-29 14:58:55,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:58:57,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:58:58,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:58:59,292 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=12.0 2023-09-29 14:59:00,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:59:00,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 14:59:00,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:59:02,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 14:59:02,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:02,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:02,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:07,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:07,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:59:08,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:08,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:59:08,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:12,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:13,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 14:59:17,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:18,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:21,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 14:59:27,078 INFO [train.py:1039] (1/4) Epoch 12, batch 1750, loss[loss=0.2002, simple_loss=0.2679, pruned_loss=0.06624, over 20263.00 frames. ], tot_loss[loss=0.1978, simple_loss=0.268, pruned_loss=0.06383, over 4698195.31 frames. ], batch size: 44, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 14:59:28,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:31,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:31,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:59:33,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 14:59:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:35,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=401220.0, ans=0.1 2023-09-29 14:59:35,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=401220.0, ans=0.125 2023-09-29 14:59:36,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:59:36,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:41,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 14:59:43,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:46,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 14:59:46,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:48,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:59:51,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 14:59:53,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 14:59:54,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:56,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 15:00:04,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:00:06,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:06,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:13,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:13,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:16,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:00:17,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:19,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:20,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:00:21,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 15:00:24,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:26,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 15:00:28,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:29,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:29,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:00:33,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:00:34,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 15:00:34,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:37,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:42,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:43,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:00:45,915 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.394e+02 1.915e+02 2.238e+02 2.656e+02 3.754e+02, threshold=4.475e+02, percent-clipped=0.0 2023-09-29 15:00:46,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:00:46,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 15:00:46,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:49,074 INFO [train.py:1039] (1/4) Epoch 12, batch 1800, loss[loss=0.218, simple_loss=0.2931, pruned_loss=0.07143, over 23934.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2671, pruned_loss=0.06347, over 4695350.46 frames. ], batch size: 86, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:00:49,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:00:49,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:00:49,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:00:49,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:00:50,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:00:50,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=401553.3333333333, ans=0.0 2023-09-29 15:00:54,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:00:55,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:58,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:00:59,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:02,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:01:04,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:01:06,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=401620.0, ans=0.125 2023-09-29 15:01:07,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:11,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:12,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:01:14,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:01:14,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 15:01:15,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:19,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:24,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 15:01:25,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 15:01:25,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 15:01:27,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:29,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:29,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:01:29,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:01:30,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=401686.6666666667, ans=0.125 2023-09-29 15:01:39,686 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 15:01:39,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:01:41,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:42,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 15:01:43,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 15:01:44,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:01:46,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:01:46,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:01:50,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 15:01:58,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=401820.0, ans=10.0 2023-09-29 15:01:59,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:01:59,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 15:01:59,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:59,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:59,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:02:01,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 15:02:04,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:02:04,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:08,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 15:02:08,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:10,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.28 vs. limit=15.0 2023-09-29 15:02:11,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:11,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:02:11,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:13,212 INFO [train.py:1039] (1/4) Epoch 12, batch 1850, loss[loss=0.1763, simple_loss=0.2572, pruned_loss=0.04768, over 24477.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2679, pruned_loss=0.06337, over 4698564.62 frames. ], batch size: 63, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:02:13,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:14,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:02:17,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:02:17,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:19,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:02:20,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:02:27,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:02:27,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 15:02:32,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 15:02:35,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 15:02:38,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=401953.3333333333, ans=0.0 2023-09-29 15:02:41,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:41,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 15:02:41,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:02:47,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.50 vs. limit=15.0 2023-09-29 15:02:51,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:02:51,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 15:02:52,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=402020.0, ans=0.04949747468305833 2023-09-29 15:02:54,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:02:54,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:02:55,513 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.86 vs. limit=22.5 2023-09-29 15:02:57,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 15:02:57,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:57,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:03:00,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:03:03,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:03:07,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:07,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=402086.6666666667, ans=0.1 2023-09-29 15:03:11,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:03:11,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:11,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:03:11,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:15,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:16,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:03:20,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=402153.3333333333, ans=0.125 2023-09-29 15:03:20,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=402153.3333333333, ans=0.1 2023-09-29 15:03:21,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 15:03:21,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=402153.3333333333, ans=10.0 2023-09-29 15:03:23,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:25,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:03:26,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:03:26,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 15:03:26,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 15:03:28,133 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 15:03:28,278 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 15:03:29,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:03:29,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:03:29,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:29,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:31,428 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 15:03:31,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:03:31,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:32,689 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.242e+02 2.724e+02 5.145e+02, threshold=4.485e+02, percent-clipped=1.0 2023-09-29 15:03:32,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:03:34,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:03:35,677 INFO [train.py:1039] (1/4) Epoch 12, batch 1900, loss[loss=0.2039, simple_loss=0.2689, pruned_loss=0.06949, over 23717.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2689, pruned_loss=0.064, over 4706317.94 frames. ], batch size: 232, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:03:37,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:03:37,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 15:03:40,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:40,405 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 15:03:42,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:03:43,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:48,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:51,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:03:51,698 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 15:03:53,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 15:03:54,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:54,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:54,827 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 15:03:56,692 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 15:03:58,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 15:04:01,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:04:03,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 15:04:03,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=402286.6666666667, ans=0.125 2023-09-29 15:04:05,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 15:04:15,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 15:04:17,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 15:04:17,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:04:19,138 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 15:04:19,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 15:04:19,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 15:04:19,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 15:04:19,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:04:24,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 15:04:29,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:04:33,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:33,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 15:04:34,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=402420.0, ans=0.2 2023-09-29 15:04:37,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:04:39,582 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.28 vs. limit=12.0 2023-09-29 15:04:40,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 15:04:40,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:04:46,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:04:46,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:04:46,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:04:48,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:04:49,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:04:49,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:04:51,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:04:54,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:04:54,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:04:57,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:04:57,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:57,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:04:58,331 INFO [train.py:1039] (1/4) Epoch 12, batch 1950, loss[loss=0.2436, simple_loss=0.2985, pruned_loss=0.09438, over 22756.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2702, pruned_loss=0.06424, over 4712213.51 frames. ], batch size: 322, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:04:58,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:05:04,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:05,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:05:05,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:05,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:05:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 15:05:10,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:05:10,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:12,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=402553.3333333333, ans=0.1 2023-09-29 15:05:13,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:16,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:05:16,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:17,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:19,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:21,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:21,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:05:22,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:05:22,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:28,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:31,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:05:31,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:31,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:05:31,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 15:05:33,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:05:33,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:05:33,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:38,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:41,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:05:45,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:05:45,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=402686.6666666667, ans=0.125 2023-09-29 15:05:48,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:05:49,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:05:49,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 15:05:49,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:05:54,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:54,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:05:55,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:01,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=402753.3333333333, ans=0.0 2023-09-29 15:06:02,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:02,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:06,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:08,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:11,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:06:12,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 15:06:13,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:06:13,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=402820.0, ans=0.0 2023-09-29 15:06:14,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:06:16,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 15:06:17,820 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.002e+02 2.294e+02 2.547e+02 3.463e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 15:06:19,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:20,850 INFO [train.py:1039] (1/4) Epoch 12, batch 2000, loss[loss=0.2022, simple_loss=0.2663, pruned_loss=0.069, over 23777.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2721, pruned_loss=0.06514, over 4712039.74 frames. ], batch size: 164, lr: 8.70e-03, grad_scale: 32.0 2023-09-29 15:06:23,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:24,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:06:25,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:06:25,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:06:27,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:31,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 15:06:32,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:06:34,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:06:36,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 15:06:38,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:06:38,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:41,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:06:42,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 15:06:44,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:46,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:47,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=402953.3333333333, ans=0.125 2023-09-29 15:06:48,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 15:06:49,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:06:51,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 15:06:51,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:06:53,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=403020.0, ans=0.2 2023-09-29 15:06:54,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:06:55,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:06:55,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:57,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:06:59,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=403020.0, ans=0.05 2023-09-29 15:06:59,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=403020.0, ans=0.1 2023-09-29 15:07:00,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:01,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 15:07:03,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 15:07:03,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:07:03,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:06,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:10,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:07:10,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:12,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:07:13,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:13,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:13,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:13,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:15,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:19,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:19,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 15:07:19,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=403086.6666666667, ans=0.0 2023-09-29 15:07:24,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:07:24,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:07:29,896 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-29 15:07:32,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:33,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:07:33,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:07:36,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:38,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:42,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:44,211 INFO [train.py:1039] (1/4) Epoch 12, batch 2050, loss[loss=0.1776, simple_loss=0.2548, pruned_loss=0.05015, over 24615.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2712, pruned_loss=0.0642, over 4714250.69 frames. ], batch size: 65, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:07:45,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:52,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:55,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:07:56,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:57,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:00,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 15:08:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:08:02,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:02,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:08:10,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:10,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:11,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=403286.6666666667, ans=0.0 2023-09-29 15:08:13,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 15:08:14,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:18,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 15:08:18,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:20,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:23,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:23,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:08:25,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:26,431 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.53 vs. limit=6.0 2023-09-29 15:08:27,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:08:29,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:08:29,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:08:33,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:35,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:08:36,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:08:39,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:43,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:08:47,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:49,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 15:08:55,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:08:56,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:08:59,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:09:01,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 15:09:04,760 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.918e+02 2.083e+02 2.421e+02 3.715e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-29 15:09:06,254 INFO [train.py:1039] (1/4) Epoch 12, batch 2100, loss[loss=0.1853, simple_loss=0.2663, pruned_loss=0.05218, over 24653.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2694, pruned_loss=0.06375, over 4718737.55 frames. ], batch size: 65, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:09:06,471 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 15:09:06,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:07,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:07,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:09,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:09:09,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 15:09:09,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 15:09:12,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:09:15,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:09:15,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:09:18,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:20,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:09:20,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 15:09:20,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:09:21,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 15:09:21,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 15:09:23,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:23,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:09:23,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 15:09:23,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:09:23,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=403620.0, ans=0.125 2023-09-29 15:09:29,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 15:09:29,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:34,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:09:36,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:39,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:09:39,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 15:09:39,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:39,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 15:09:41,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 15:09:43,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:43,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 15:09:43,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 15:09:43,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 15:09:46,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:09:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:09:48,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=403686.6666666667, ans=0.125 2023-09-29 15:09:50,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:52,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:54,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:55,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:55,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 15:09:55,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:55,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:57,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:59,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 15:10:00,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 15:10:02,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 15:10:06,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:10:09,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:10:09,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 15:10:15,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:19,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:10:19,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:10:19,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:10:19,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 15:10:19,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:10:21,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:21,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:10:22,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:10:22,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:23,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=403820.0, ans=0.125 2023-09-29 15:10:25,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 15:10:27,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 15:10:27,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:28,529 INFO [train.py:1039] (1/4) Epoch 12, batch 2150, loss[loss=0.1913, simple_loss=0.261, pruned_loss=0.06086, over 23544.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2682, pruned_loss=0.06293, over 4721403.78 frames. ], batch size: 120, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:10:28,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:10:28,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:10:30,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:10:30,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:10:32,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=403886.6666666667, ans=0.0 2023-09-29 15:10:37,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:10:37,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:37,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=403886.6666666667, ans=0.025 2023-09-29 15:10:39,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:40,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:10:40,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:40,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:10:45,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:47,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:10:47,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:10:51,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:51,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 15:10:55,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:10:57,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:10:57,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=403953.3333333333, ans=0.0 2023-09-29 15:10:59,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:59,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:10:59,689 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.52 vs. limit=22.5 2023-09-29 15:11:00,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:00,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:11:01,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:01,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:11:02,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:11:03,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 15:11:05,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:11:07,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:07,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:09,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:11:12,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:11:14,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:14,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:11:14,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=404020.0, ans=0.1 2023-09-29 15:11:15,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:15,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 15:11:15,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:11:16,037 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=404020.0, ans=0.125 2023-09-29 15:11:18,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:18,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:21,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:22,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:11:22,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:24,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:24,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 15:11:26,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 15:11:27,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:11:27,846 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 15:11:27,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:29,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:11:29,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 15:11:29,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:11:29,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 15:11:30,802 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 15:11:30,802 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 15:11:30,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 15:11:33,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:33,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:33,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:11:33,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:35,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:11:38,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:38,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:48,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:11:49,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 15:11:50,281 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.936e+02 2.140e+02 2.613e+02 4.157e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 15:11:51,880 INFO [train.py:1039] (1/4) Epoch 12, batch 2200, loss[loss=0.1951, simple_loss=0.2829, pruned_loss=0.05364, over 24679.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2681, pruned_loss=0.06285, over 4722763.26 frames. ], batch size: 73, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:11:52,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:11:59,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:59,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:12:00,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:02,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:12:04,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:12:04,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:12:04,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 15:12:09,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 15:12:11,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:12:19,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 15:12:22,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:23,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.84 vs. limit=22.5 2023-09-29 15:12:24,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:24,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:12:27,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:12:28,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 15:12:32,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:12:33,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:34,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:12:37,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:12:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:12:39,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=404353.3333333333, ans=0.125 2023-09-29 15:12:42,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:12:43,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:44,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=404420.0, ans=0.2 2023-09-29 15:12:44,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=404420.0, ans=0.0 2023-09-29 15:12:45,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 15:12:46,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:47,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 15:12:50,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:50,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:12:52,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:54,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:55,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:12:55,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:55,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:57,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:12:57,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:12:59,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=404486.6666666667, ans=0.0 2023-09-29 15:13:00,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:13:03,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:13:04,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:05,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.41 vs. limit=6.0 2023-09-29 15:13:07,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:13:08,831 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 15:13:12,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:13:12,413 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 15:13:13,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:13:14,020 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 15:13:14,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:15,594 INFO [train.py:1039] (1/4) Epoch 12, batch 2250, loss[loss=0.1878, simple_loss=0.2562, pruned_loss=0.05967, over 23663.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2688, pruned_loss=0.06277, over 4722104.76 frames. ], batch size: 135, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:13:15,704 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:13:17,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:18,899 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 15:13:20,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:13:23,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:29,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:13:29,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:13:32,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:34,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:35,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:38,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 15:13:38,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:13:38,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:13:40,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 15:13:40,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:13:42,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:43,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:50,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:51,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:13:52,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:13:53,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=404686.6666666667, ans=0.125 2023-09-29 15:13:54,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 15:13:56,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:56,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:14:03,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:05,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:06,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:14:06,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:14:08,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=404753.3333333333, ans=0.125 2023-09-29 15:14:09,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:14:10,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=404753.3333333333, ans=0.1 2023-09-29 15:14:11,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:14:13,024 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=404753.3333333333, ans=0.0 2023-09-29 15:14:16,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:14:18,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:14:23,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:14:23,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:14:23,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:14:29,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:14:31,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:14:31,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 15:14:31,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:31,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=404820.0, ans=0.1 2023-09-29 15:14:32,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:14:34,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 15:14:35,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=404820.0, ans=0.0 2023-09-29 15:14:36,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.000e+02 2.393e+02 2.961e+02 4.518e+02, threshold=4.786e+02, percent-clipped=2.0 2023-09-29 15:14:38,348 INFO [train.py:1039] (1/4) Epoch 12, batch 2300, loss[loss=0.2102, simple_loss=0.2879, pruned_loss=0.06627, over 24056.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2702, pruned_loss=0.06368, over 4722191.03 frames. ], batch size: 80, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:14:38,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:14:38,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:44,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=404886.6666666667, ans=0.1 2023-09-29 15:14:45,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:46,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:14:48,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=404886.6666666667, ans=0.125 2023-09-29 15:14:49,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 15:14:51,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:00,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:15:00,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:15:00,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:02,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:02,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 15:15:02,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:15:03,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=22.5 2023-09-29 15:15:05,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:05,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:15:08,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:15:13,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:15:15,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:15:20,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:22,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=405020.0, ans=0.125 2023-09-29 15:15:23,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:15:26,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:15:27,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=405086.6666666667, ans=0.125 2023-09-29 15:15:30,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:30,671 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.88 vs. limit=15.0 2023-09-29 15:15:32,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:15:32,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:15:32,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 15:15:36,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:15:36,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:36,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:36,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:15:38,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:38,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:15:38,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:15:39,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 15:15:39,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:15:39,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:41,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 15:15:44,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=405153.3333333333, ans=0.1 2023-09-29 15:15:50,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:15:51,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:15:57,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:58,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:15:58,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:15:58,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:15:59,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=405220.0, ans=0.2 2023-09-29 15:16:00,289 INFO [train.py:1039] (1/4) Epoch 12, batch 2350, loss[loss=0.1756, simple_loss=0.2518, pruned_loss=0.04973, over 24429.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2705, pruned_loss=0.0641, over 4728617.03 frames. ], batch size: 58, lr: 8.67e-03, grad_scale: 8.0 2023-09-29 15:16:00,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:00,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:16:02,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 15:16:07,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:07,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 15:16:13,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 15:16:16,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:16:21,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:23,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:25,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 15:16:28,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:16:33,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=405353.3333333333, ans=0.125 2023-09-29 15:16:35,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 15:16:35,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:38,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:16:38,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:39,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:16:42,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 15:16:42,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:16:44,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:44,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:16:44,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:16:46,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:16:49,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 15:16:50,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:54,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=405420.0, ans=0.0 2023-09-29 15:16:55,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:55,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:16:55,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 15:16:56,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:16:59,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 15:17:00,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:17:06,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 15:17:11,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 15:17:12,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:17:12,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:17:14,159 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 15:17:14,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 15:17:17,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 15:17:20,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:17:22,272 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.128e+02 2.368e+02 3.787e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 15:17:22,315 INFO [train.py:1039] (1/4) Epoch 12, batch 2400, loss[loss=0.1831, simple_loss=0.2639, pruned_loss=0.05117, over 24495.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2708, pruned_loss=0.06411, over 4714705.62 frames. ], batch size: 66, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:17:25,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:17:27,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:17:27,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=405553.3333333333, ans=0.0 2023-09-29 15:17:30,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:17:30,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 15:17:30,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 15:17:40,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:17:40,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:17:41,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 15:17:42,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:17:43,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:43,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 15:17:48,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:50,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 15:17:50,802 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:17:56,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:18:02,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 15:18:02,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=405686.6666666667, ans=0.125 2023-09-29 15:18:05,314 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.28 vs. limit=6.0 2023-09-29 15:18:05,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:07,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:12,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:12,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 15:18:12,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:18:20,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:23,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:18:25,870 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.20 vs. limit=10.0 2023-09-29 15:18:26,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:18:26,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:18:28,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:18:28,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:28,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:28,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:18:33,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:18:35,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:18:35,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 15:18:37,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 15:18:39,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:39,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:39,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 15:18:41,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 15:18:41,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 15:18:41,444 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 15:18:43,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 15:18:44,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:45,405 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:18:46,474 INFO [train.py:1039] (1/4) Epoch 12, batch 2450, loss[loss=0.1921, simple_loss=0.2774, pruned_loss=0.05345, over 24478.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.269, pruned_loss=0.06322, over 4717650.57 frames. ], batch size: 69, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:18:46,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:46,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:46,704 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 15:18:48,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:49,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:18:50,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.80 vs. limit=15.0 2023-09-29 15:18:54,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:18:54,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:54,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=405886.6666666667, ans=0.125 2023-09-29 15:18:57,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:57,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:59,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 15:19:03,045 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.29 vs. limit=22.5 2023-09-29 15:19:05,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:05,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:08,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:19:08,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:19:08,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:19:08,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 15:19:09,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=405953.3333333333, ans=0.0 2023-09-29 15:19:12,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=405953.3333333333, ans=0.0 2023-09-29 15:19:14,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:17,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:19:19,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:19:21,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:19:21,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:19:26,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 15:19:27,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:19:32,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:34,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:35,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:35,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:19:35,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:37,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:19:38,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 15:19:40,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:40,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:19:45,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:19:45,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:51,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:19:51,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 15:19:52,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=406153.3333333333, ans=0.125 2023-09-29 15:19:53,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:19:53,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:54,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 15:19:54,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:19:56,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:20:00,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:20:03,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:03,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:20:04,201 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:20:07,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 15:20:08,460 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.958e+02 2.224e+02 2.647e+02 3.379e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 15:20:08,503 INFO [train.py:1039] (1/4) Epoch 12, batch 2500, loss[loss=0.1743, simple_loss=0.2456, pruned_loss=0.05146, over 24308.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2674, pruned_loss=0.06259, over 4705151.88 frames. ], batch size: 56, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:20:08,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:20:15,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:15,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=406220.0, ans=0.0 2023-09-29 15:20:17,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=406220.0, ans=0.5 2023-09-29 15:20:25,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:20:25,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:20:25,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=406286.6666666667, ans=0.95 2023-09-29 15:20:26,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:26,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 15:20:33,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:20:34,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:20:36,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:20:36,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:20:36,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 15:20:39,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:39,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:39,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 15:20:39,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:39,668 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:20:40,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 15:20:41,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:20:46,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=406353.3333333333, ans=0.07 2023-09-29 15:20:47,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:48,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:51,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:20:51,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 15:20:53,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:20:54,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:59,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:00,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=406420.0, ans=0.125 2023-09-29 15:21:03,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:03,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=406420.0, ans=0.2 2023-09-29 15:21:07,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:10,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=406420.0, ans=0.1 2023-09-29 15:21:13,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:21:15,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 15:21:16,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:16,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:19,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:21:19,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:21:19,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=406486.6666666667, ans=0.0 2023-09-29 15:21:21,165 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 15:21:21,166 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 15:21:21,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 15:21:24,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:26,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 15:21:26,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 15:21:26,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=406486.6666666667, ans=0.0 2023-09-29 15:21:29,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:29,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 15:21:31,830 INFO [train.py:1039] (1/4) Epoch 12, batch 2550, loss[loss=0.2073, simple_loss=0.2775, pruned_loss=0.06854, over 23233.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2677, pruned_loss=0.06232, over 4723745.54 frames. ], batch size: 93, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:21:33,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 15:21:33,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=406553.3333333333, ans=0.125 2023-09-29 15:21:37,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:38,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:21:38,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:21:41,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:43,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 15:21:43,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:21:45,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=406553.3333333333, ans=0.1 2023-09-29 15:21:46,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 15:21:48,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:21:51,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:52,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:52,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 15:21:52,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:21:53,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:21:54,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:57,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:21:57,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 15:21:57,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:57,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:57,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 15:21:58,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.20 vs. limit=15.0 2023-09-29 15:22:14,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:22:16,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=406686.6666666667, ans=0.1 2023-09-29 15:22:17,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.31 vs. limit=15.0 2023-09-29 15:22:18,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:18,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:18,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:22:19,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:22:20,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=406753.3333333333, ans=0.05 2023-09-29 15:22:22,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=406753.3333333333, ans=0.125 2023-09-29 15:22:26,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:22:29,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=406753.3333333333, ans=0.1 2023-09-29 15:22:30,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:22:30,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:22:31,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:22:31,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:22:31,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:22:37,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:37,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:41,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:22:41,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 15:22:41,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:22:43,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:43,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=406820.0, ans=0.0 2023-09-29 15:22:44,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:22:46,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:22:46,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:53,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:22:53,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=406886.6666666667, ans=0.2 2023-09-29 15:22:54,378 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.975e+02 2.307e+02 2.705e+02 3.697e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-29 15:22:54,421 INFO [train.py:1039] (1/4) Epoch 12, batch 2600, loss[loss=0.2104, simple_loss=0.2724, pruned_loss=0.07424, over 23478.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2687, pruned_loss=0.06265, over 4730562.51 frames. ], batch size: 285, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:22:54,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:57,879 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 15:23:02,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 15:23:02,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:23:02,319 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 15:23:03,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 15:23:03,854 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 15:23:06,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:23:06,879 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 15:23:09,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 15:23:11,318 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 15:23:12,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:23:14,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 15:23:17,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 15:23:19,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:23:19,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 15:23:22,942 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 15:23:22,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 15:23:29,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=407020.0, ans=0.125 2023-09-29 15:23:30,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:30,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:30,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:30,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 15:23:33,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:23:39,948 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 15:23:45,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:45,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:46,006 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:23:47,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 15:23:48,873 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.12 vs. limit=15.0 2023-09-29 15:23:49,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:23:49,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:50,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 15:23:53,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:23:53,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:23:58,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,638 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 15:24:02,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:24:07,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:24:08,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:24:08,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 15:24:10,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:24:11,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:11,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:16,564 INFO [train.py:1039] (1/4) Epoch 12, batch 2650, loss[loss=0.2117, simple_loss=0.2779, pruned_loss=0.07278, over 23378.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.269, pruned_loss=0.06277, over 4738821.61 frames. ], batch size: 119, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:24:16,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 15:24:18,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:21,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:24:27,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 15:24:27,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:28,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:24:28,654 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 15:24:28,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:24:30,166 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-09-29 15:24:30,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:32,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:24:34,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:37,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:37,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 15:24:38,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:24:38,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:24:39,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=407286.6666666667, ans=0.95 2023-09-29 15:24:41,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 15:24:42,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.90 vs. limit=22.5 2023-09-29 15:24:43,151 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 15:24:44,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:47,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 15:24:47,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:24:49,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 15:24:55,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:55,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:24:55,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:57,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:01,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 15:25:01,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 15:25:06,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:08,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 15:25:08,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:10,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:11,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:11,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:11,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:12,335 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.94 vs. limit=10.0 2023-09-29 15:25:14,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:16,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:16,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:25:16,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:25:17,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:25:19,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:19,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=407420.0, ans=0.2 2023-09-29 15:25:20,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:25:21,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:23,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:24,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:25:25,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:27,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:25:27,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:27,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 15:25:32,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:34,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:34,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:37,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:38,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:39,910 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.894e+02 2.126e+02 2.422e+02 3.822e+02, threshold=4.252e+02, percent-clipped=0.0 2023-09-29 15:25:39,952 INFO [train.py:1039] (1/4) Epoch 12, batch 2700, loss[loss=0.2036, simple_loss=0.2724, pruned_loss=0.06743, over 23181.00 frames. ], tot_loss[loss=0.1987, simple_loss=0.2702, pruned_loss=0.06362, over 4719102.27 frames. ], batch size: 105, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:25:40,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:41,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:25:43,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 15:25:46,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:25:47,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:25:49,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:49,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:49,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:50,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:25:50,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:50,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:25:50,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:25:51,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=407553.3333333333, ans=0.2 2023-09-29 15:25:51,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=407553.3333333333, ans=0.2 2023-09-29 15:25:52,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 15:25:52,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:25:52,927 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.62 vs. limit=22.5 2023-09-29 15:25:53,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:55,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:25:56,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:26:00,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:26:00,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 15:26:00,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=407620.0, ans=0.125 2023-09-29 15:26:02,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:07,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:26:07,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:07,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=407620.0, ans=0.125 2023-09-29 15:26:14,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:26:14,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:26:14,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:26:14,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:26:18,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:19,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:26:19,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:26:19,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:26:25,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:25,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:26:34,699 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.34 vs. limit=10.0 2023-09-29 15:26:35,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:26:35,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:26:37,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=407753.3333333333, ans=0.125 2023-09-29 15:26:40,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:26:40,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:44,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:44,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:46,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:26:48,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:26:48,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=407820.0, ans=0.0 2023-09-29 15:26:49,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:51,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:26:52,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:54,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:54,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:56,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 15:26:57,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:26:59,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 15:27:00,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 15:27:00,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:02,265 INFO [train.py:1039] (1/4) Epoch 12, batch 2750, loss[loss=0.1933, simple_loss=0.2485, pruned_loss=0.06902, over 23634.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2696, pruned_loss=0.0634, over 4710217.09 frames. ], batch size: 256, lr: 8.64e-03, grad_scale: 16.0 2023-09-29 15:27:03,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:06,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:08,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:27:09,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:12,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:14,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:27:14,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:27:14,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:14,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 15:27:14,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:27:14,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:21,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 15:27:21,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=407953.3333333333, ans=0.0 2023-09-29 15:27:23,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:27:23,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:24,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:27:26,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:27:26,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=407953.3333333333, ans=0.0 2023-09-29 15:27:27,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:27,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:27:27,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:29,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:32,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:27:32,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:27:32,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:27:34,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:34,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:27:37,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=408020.0, ans=0.0 2023-09-29 15:27:43,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:46,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:27:46,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:50,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:50,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:27:51,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:27:59,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:28:00,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:28:00,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 15:28:05,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:08,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 15:28:13,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:28:14,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:28:16,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 15:28:18,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:28:18,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:28:20,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 15:28:20,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:28:24,686 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.991e+02 2.173e+02 2.887e+02 4.719e+02, threshold=4.346e+02, percent-clipped=2.0 2023-09-29 15:28:24,734 INFO [train.py:1039] (1/4) Epoch 12, batch 2800, loss[loss=0.2013, simple_loss=0.2746, pruned_loss=0.064, over 24353.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2683, pruned_loss=0.06264, over 4715062.91 frames. ], batch size: 61, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:28:24,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:28:24,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:24,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:28:26,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 15:28:26,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:28,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:29,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:30,027 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 15:28:30,028 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 15:28:33,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:36,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:28:36,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:28:39,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:28:39,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=408286.6666666667, ans=10.0 2023-09-29 15:28:40,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=14.08 vs. limit=15.0 2023-09-29 15:28:41,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 15:28:43,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=408286.6666666667, ans=0.125 2023-09-29 15:28:44,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:28:44,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 15:28:45,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:45,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:28:45,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:28:49,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:28:49,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:49,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:28:52,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:29:00,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:29:02,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:05,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:05,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:29:06,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:13,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:13,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 15:29:14,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:14,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:14,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:29:18,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=408420.0, ans=0.0 2023-09-29 15:29:19,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:19,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:24,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:26,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:29:27,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:27,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:29:28,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:29:28,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:29:30,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:29:30,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 15:29:30,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:31,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:29:31,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:33,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 15:29:35,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:35,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:29:36,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:29:38,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 15:29:39,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.47 vs. limit=15.0 2023-09-29 15:29:44,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=408486.6666666667, ans=0.1 2023-09-29 15:29:45,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:45,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:29:45,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:29:48,304 INFO [train.py:1039] (1/4) Epoch 12, batch 2850, loss[loss=0.2019, simple_loss=0.2668, pruned_loss=0.06845, over 23172.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2665, pruned_loss=0.06194, over 4707233.32 frames. ], batch size: 105, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:29:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:29:51,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:29:51,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:53,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:55,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.85 vs. limit=15.0 2023-09-29 15:29:56,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:56,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:59,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:30:00,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 15:30:06,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 15:30:06,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:08,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 15:30:10,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:12,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 15:30:14,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 15:30:15,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:18,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=408620.0, ans=0.07 2023-09-29 15:30:29,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:29,738 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:30:31,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:31,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:30:31,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=408686.6666666667, ans=0.125 2023-09-29 15:30:32,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:30:32,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:30:33,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:30:36,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:30:36,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 15:30:37,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=408753.3333333333, ans=0.125 2023-09-29 15:30:38,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:30:38,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:30:38,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:38,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:41,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:41,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:43,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:45,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:48,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:30:48,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:50,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:53,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:30:57,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:30:59,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 15:31:00,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 15:31:02,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:31:02,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:03,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 15:31:03,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:31:05,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:05,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:05,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:31:05,234 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 15:31:07,274 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 15:31:07,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:07,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:10,767 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.902e+02 2.094e+02 2.569e+02 4.916e+02, threshold=4.188e+02, percent-clipped=1.0 2023-09-29 15:31:10,816 INFO [train.py:1039] (1/4) Epoch 12, batch 2900, loss[loss=0.2179, simple_loss=0.2872, pruned_loss=0.07427, over 23632.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2671, pruned_loss=0.0621, over 4710164.92 frames. ], batch size: 85, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:31:12,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:12,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:12,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:31:14,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 15:31:18,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:19,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 15:31:21,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 15:31:22,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:31:22,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:31:22,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=408886.6666666667, ans=0.125 2023-09-29 15:31:26,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:26,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:31:31,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:31,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:34,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:31:35,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 15:31:35,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:31:37,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 15:31:40,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 15:31:44,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:44,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 15:31:44,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:31:47,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:31:49,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:52,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:53,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:57,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:02,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:02,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 15:32:04,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 15:32:04,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:32:08,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:32:10,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 15:32:11,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:32:16,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:32:24,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:32:24,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:32:24,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=409153.3333333333, ans=0.125 2023-09-29 15:32:25,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 15:32:27,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:27,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 15:32:27,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=409153.3333333333, ans=0.1 2023-09-29 15:32:28,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:28,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:32:30,266 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.38 vs. limit=15.0 2023-09-29 15:32:34,390 INFO [train.py:1039] (1/4) Epoch 12, batch 2950, loss[loss=0.2698, simple_loss=0.3175, pruned_loss=0.1111, over 19407.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2681, pruned_loss=0.06215, over 4714728.14 frames. ], batch size: 388, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:32:34,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:36,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=409220.0, ans=0.09899494936611666 2023-09-29 15:32:37,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 15:32:37,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:37,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:37,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=409220.0, ans=0.125 2023-09-29 15:32:39,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:32:40,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:32:42,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 15:32:42,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 15:32:43,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:32:43,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:50,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:32:51,868 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.70 vs. limit=6.0 2023-09-29 15:32:52,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:32:55,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:55,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:32:55,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=409286.6666666667, ans=0.2 2023-09-29 15:32:58,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:32:58,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:33:02,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:33:05,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 15:33:09,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 15:33:09,454 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 15:33:10,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:33:12,506 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 15:33:15,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 15:33:15,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:33:16,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:33:16,835 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 15:33:16,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:33:19,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 15:33:20,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:33:20,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:33:23,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:24,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-09-29 15:33:25,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:33:25,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:27,400 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 15:33:27,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:27,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 15:33:32,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=409420.0, ans=0.1 2023-09-29 15:33:34,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:37,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:33:38,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 15:33:38,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:33:38,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 15:33:41,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:43,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:33:45,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:33:46,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:46,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:33:46,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:33:48,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:48,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:33:50,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:33:50,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:51,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:33:53,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:53,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 15:33:54,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:56,517 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.935e+02 2.190e+02 2.674e+02 3.950e+02, threshold=4.379e+02, percent-clipped=0.0 2023-09-29 15:33:56,560 INFO [train.py:1039] (1/4) Epoch 12, batch 3000, loss[loss=0.1943, simple_loss=0.2724, pruned_loss=0.05808, over 24391.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2698, pruned_loss=0.06305, over 4706248.83 frames. ], batch size: 77, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:33:56,561 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 15:34:11,478 INFO [train.py:1071] (1/4) Epoch 12, validation: loss=0.2606, simple_loss=0.2686, pruned_loss=0.1263, over 1125622.00 frames. 2023-09-29 15:34:11,479 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 15:34:14,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:34:14,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:34:19,209 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 15:34:19,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 15:34:20,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:34:20,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:34:22,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 15:34:22,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:29,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:34:39,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:34:39,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=409620.0, ans=0.125 2023-09-29 15:34:47,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 15:34:47,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:34:50,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:34:51,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=409686.6666666667, ans=0.125 2023-09-29 15:34:52,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:52,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:34:53,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:34:53,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 15:34:56,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 15:34:58,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:34:58,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:35:02,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:35:02,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:04,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:04,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:35:07,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:35:08,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:35:08,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:35:10,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:13,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 15:35:15,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:35:15,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=409820.0, ans=0.07 2023-09-29 15:35:16,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:17,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:35:20,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:22,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:22,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=409820.0, ans=0.2 2023-09-29 15:35:23,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:35:23,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 15:35:23,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:35:25,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 15:35:26,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:35:28,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 15:35:31,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:35:32,695 INFO [train.py:1039] (1/4) Epoch 12, batch 3050, loss[loss=0.2092, simple_loss=0.281, pruned_loss=0.06863, over 23177.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2705, pruned_loss=0.06389, over 4705617.52 frames. ], batch size: 105, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:35:32,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:35:34,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 15:35:34,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 15:35:34,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:35:34,700 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=409886.6666666667, ans=0.125 2023-09-29 15:35:35,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:35:36,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:36,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=409886.6666666667, ans=0.0 2023-09-29 15:35:37,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:35:37,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:37,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:35:38,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.89 vs. limit=15.0 2023-09-29 15:35:39,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 15:35:41,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:35:43,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:35:43,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:35:48,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:48,859 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.56 vs. limit=15.0 2023-09-29 15:35:51,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 15:35:56,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=409953.3333333333, ans=0.125 2023-09-29 15:35:59,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 15:35:59,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 15:35:59,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:01,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:36:04,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:05,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:05,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:07,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:08,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:36:08,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:10,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:10,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:12,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:14,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:15,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:17,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 15:36:17,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:17,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:36:20,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-09-29 15:36:20,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:36:20,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:36:22,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:36:22,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:28,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:28,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:37,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:37,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:36:37,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:40,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:40,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:36:42,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:42,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 15:36:44,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:44,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:47,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 15:36:51,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:54,186 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.862e+02 2.008e+02 2.262e+02 2.937e+02, threshold=4.017e+02, percent-clipped=0.0 2023-09-29 15:36:54,232 INFO [train.py:1039] (1/4) Epoch 12, batch 3100, loss[loss=0.1739, simple_loss=0.2458, pruned_loss=0.05099, over 24467.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2698, pruned_loss=0.06298, over 4700632.24 frames. ], batch size: 58, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:36:54,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:56,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:36:59,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:37:01,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 15:37:03,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 15:37:05,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 15:37:07,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:37:10,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:37:10,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:12,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:37:15,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:19,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=410286.6666666667, ans=0.09899494936611666 2023-09-29 15:37:22,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 15:37:26,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:37:27,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:27,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:27,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:37:29,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:37:30,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:37:30,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 15:37:30,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:37:32,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:32,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 15:37:33,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:37:34,620 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.54 vs. limit=15.0 2023-09-29 15:37:37,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=410353.3333333333, ans=0.0 2023-09-29 15:37:39,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:37:40,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 15:37:41,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 15:37:42,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:43,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:45,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:45,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:46,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:37:46,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:37:46,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:37:50,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:37:50,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:37:50,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:50,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 15:37:56,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:57,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 15:37:58,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:37:59,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 15:37:59,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:59,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:01,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 15:38:12,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 15:38:14,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:15,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:17,009 INFO [train.py:1039] (1/4) Epoch 12, batch 3150, loss[loss=0.1677, simple_loss=0.2456, pruned_loss=0.04492, over 24620.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2676, pruned_loss=0.06247, over 4691190.93 frames. ], batch size: 60, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:38:17,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:38:17,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:38:17,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 15:38:18,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:20,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:38:22,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 15:38:23,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:27,338 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 15:38:29,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 15:38:31,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:38:32,588 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 15:38:34,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:38:34,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 15:38:35,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 15:38:35,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 15:38:35,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:35,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:38:37,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:38,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=410620.0, ans=0.1 2023-09-29 15:38:40,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 15:38:40,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=410620.0, ans=0.0 2023-09-29 15:38:41,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:42,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:43,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:46,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:38:49,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 15:38:50,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:38:53,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:38:53,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:53,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 15:38:58,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 15:38:58,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:39:00,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:39:00,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:39:00,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:00,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:39:02,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:39:02,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:39:04,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 15:39:06,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:39:06,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:07,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:39:07,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:39:07,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 15:39:07,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:09,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 15:39:10,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:11,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 15:39:12,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 15:39:12,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:39:14,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:14,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 15:39:16,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:39:17,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:19,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:39:21,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:21,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:39:27,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:39:27,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:31,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 15:39:37,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:39:37,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:39:41,205 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.037e+02 2.402e+02 2.745e+02 4.943e+02, threshold=4.804e+02, percent-clipped=1.0 2023-09-29 15:39:41,248 INFO [train.py:1039] (1/4) Epoch 12, batch 3200, loss[loss=0.2116, simple_loss=0.2866, pruned_loss=0.06831, over 24473.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2667, pruned_loss=0.06244, over 4694398.42 frames. ], batch size: 77, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:39:42,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:43,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:39:43,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 15:39:43,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=410886.6666666667, ans=0.1 2023-09-29 15:39:46,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:49,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:39:52,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:40:03,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:40:14,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 15:40:15,090 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.38 vs. limit=22.5 2023-09-29 15:40:15,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:40:18,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 15:40:20,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:40:23,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:40:23,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:40:25,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:40:29,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 15:40:31,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:40:33,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 15:40:35,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 15:40:38,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:40:43,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:40:45,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,391 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 15:40:45,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:40:46,177 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.70 vs. limit=10.0 2023-09-29 15:40:49,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:40:51,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 15:40:52,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 15:40:54,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 15:40:54,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 15:40:55,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:41:00,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:41:00,112 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 15:41:00,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:00,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:01,662 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 15:41:03,200 INFO [train.py:1039] (1/4) Epoch 12, batch 3250, loss[loss=0.2179, simple_loss=0.2819, pruned_loss=0.07698, over 20300.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2665, pruned_loss=0.06275, over 4691315.68 frames. ], batch size: 44, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:41:06,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:41:10,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:15,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=411220.0, ans=0.05 2023-09-29 15:41:19,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:41:19,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 15:41:20,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:20,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:41:20,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:23,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:23,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:41:25,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:27,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:41:27,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:27,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:27,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:28,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:41:30,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:32,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:35,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:37,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:37,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:37,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:41:41,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=411353.3333333333, ans=0.07 2023-09-29 15:41:44,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 15:41:44,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:45,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:41:46,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:47,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:41:49,492 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:41:52,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:42:00,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:00,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:00,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 15:42:00,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:42:00,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:42:00,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:02,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=411420.0, ans=0.2 2023-09-29 15:42:04,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 15:42:04,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 15:42:06,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:42:06,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=411420.0, ans=0.125 2023-09-29 15:42:07,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:07,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:09,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:42:09,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:14,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:14,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:17,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 15:42:17,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:20,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:42:20,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 15:42:25,025 INFO [train.py:1039] (1/4) Epoch 12, batch 3300, loss[loss=0.2173, simple_loss=0.2918, pruned_loss=0.0714, over 23360.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2673, pruned_loss=0.06264, over 4710980.48 frames. ], batch size: 93, lr: 8.61e-03, grad_scale: 16.0 2023-09-29 15:42:25,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:25,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 15:42:26,586 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.957e+02 2.272e+02 2.906e+02 4.656e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 15:42:26,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 15:42:27,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=411553.3333333333, ans=0.1 2023-09-29 15:42:28,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 15:42:28,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:33,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:34,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:42:36,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:37,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:42:37,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:42:40,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:41,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=411620.0, ans=0.125 2023-09-29 15:42:41,457 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.30 vs. limit=22.5 2023-09-29 15:42:42,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:46,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 15:42:46,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:42:46,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:48,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:48,629 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 15:42:50,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:42:50,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:42:51,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:42:51,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:42:53,214 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 15:42:55,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:55,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:42:58,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 15:43:01,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 15:43:01,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:01,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=411686.6666666667, ans=0.125 2023-09-29 15:43:02,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:43:04,606 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 15:43:06,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 15:43:08,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:43:11,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 15:43:12,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=411686.6666666667, ans=0.125 2023-09-29 15:43:13,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:15,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:43:17,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:20,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:20,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:43:20,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:43:23,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=411753.3333333333, ans=0.125 2023-09-29 15:43:23,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=411753.3333333333, ans=0.2 2023-09-29 15:43:24,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:43:24,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:26,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:43:26,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=411753.3333333333, ans=0.1 2023-09-29 15:43:27,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 15:43:29,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 15:43:30,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:43:30,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:43:30,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:34,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:34,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:35,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:43:36,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:36,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:43:37,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:39,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:43:42,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 15:43:42,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:44,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:47,610 INFO [train.py:1039] (1/4) Epoch 12, batch 3350, loss[loss=0.1759, simple_loss=0.2521, pruned_loss=0.04981, over 24298.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2681, pruned_loss=0.06292, over 4716700.90 frames. ], batch size: 61, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:43:47,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:43:47,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:49,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:51,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:51,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:54,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:56,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:57,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:44:00,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:02,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:44:03,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:05,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:44:05,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 15:44:05,391 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 15:44:05,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=411953.3333333333, ans=0.2 2023-09-29 15:44:06,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:08,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 15:44:08,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 15:44:12,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:44:12,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:44:12,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:12,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 15:44:13,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:13,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:44:16,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:18,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:18,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:20,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:44:23,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:28,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:28,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:32,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:44:33,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:33,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=412020.0, ans=0.2 2023-09-29 15:44:36,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:36,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:37,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:40,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 15:44:40,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:44:42,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 15:44:42,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:44:44,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 15:44:45,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:46,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:51,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.48 vs. limit=6.0 2023-09-29 15:44:53,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:53,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=412153.3333333333, ans=0.0 2023-09-29 15:44:55,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 15:44:55,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:44:55,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:44:56,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:45:02,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:05,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 15:45:05,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:45:05,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:45:08,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:08,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 15:45:09,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:45:09,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 15:45:11,201 INFO [train.py:1039] (1/4) Epoch 12, batch 3400, loss[loss=0.2166, simple_loss=0.2728, pruned_loss=0.08019, over 23897.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2695, pruned_loss=0.06387, over 4721944.84 frames. ], batch size: 196, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:45:11,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:11,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:11,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:45:13,411 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.868e+02 2.094e+02 2.439e+02 4.049e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 15:45:13,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:45:15,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 15:45:18,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 15:45:18,210 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 15:45:18,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:45:18,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=412220.0, ans=0.2 2023-09-29 15:45:21,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:21,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:45:22,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:25,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:45:30,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=412286.6666666667, ans=0.0 2023-09-29 15:45:33,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:45:34,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.98 vs. limit=15.0 2023-09-29 15:45:36,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 15:45:42,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:45:44,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:44,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:46,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:45:50,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:45:54,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 15:45:57,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=412353.3333333333, ans=0.1 2023-09-29 15:46:00,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:00,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:03,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 15:46:04,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:04,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:06,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:46:06,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:46:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:46:15,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:46:16,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:46:20,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:22,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 15:46:26,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.65 vs. limit=15.0 2023-09-29 15:46:29,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:46:33,843 INFO [train.py:1039] (1/4) Epoch 12, batch 3450, loss[loss=0.2097, simple_loss=0.2781, pruned_loss=0.07069, over 23276.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2692, pruned_loss=0.06335, over 4728912.87 frames. ], batch size: 105, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:46:33,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 15:46:37,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 15:46:37,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:39,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:46:41,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 15:46:42,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:45,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:46:51,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:46:52,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:46:52,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:46:52,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:52,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=412620.0, ans=0.125 2023-09-29 15:46:54,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:57,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=412620.0, ans=0.125 2023-09-29 15:46:58,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=412620.0, ans=0.125 2023-09-29 15:47:00,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 15:47:07,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 15:47:07,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:47:07,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:47:10,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:16,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 15:47:17,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:47:21,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:21,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:47:22,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:47:24,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:47:27,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 15:47:27,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:27,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=412753.3333333333, ans=0.125 2023-09-29 15:47:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:47:31,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:47:34,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 15:47:38,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:47:39,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=412820.0, ans=0.0 2023-09-29 15:47:43,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:47:46,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:48,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:47:53,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:53,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:55,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:47:57,199 INFO [train.py:1039] (1/4) Epoch 12, batch 3500, loss[loss=0.2153, simple_loss=0.2697, pruned_loss=0.08043, over 23808.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2684, pruned_loss=0.06302, over 4725148.36 frames. ], batch size: 212, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:47:57,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:58,594 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.065e+02 2.305e+02 4.202e+02, threshold=4.129e+02, percent-clipped=1.0 2023-09-29 15:48:01,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:04,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:48:05,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 15:48:07,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:48:11,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 15:48:13,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:13,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 15:48:16,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:48:17,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=412953.3333333333, ans=0.125 2023-09-29 15:48:18,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:48:20,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:48:20,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:20,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:48:20,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:22,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:22,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 15:48:27,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:27,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:48:27,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:28,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.93 vs. limit=15.0 2023-09-29 15:48:32,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:34,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 15:48:34,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:37,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:37,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:48:38,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:41,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:48:41,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:42,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 15:48:44,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 15:48:44,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=413020.0, ans=0.125 2023-09-29 15:48:45,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 15:48:45,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:47,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:47,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=413086.6666666667, ans=0.125 2023-09-29 15:48:47,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=413086.6666666667, ans=0.125 2023-09-29 15:48:48,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:48,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:48:49,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=413086.6666666667, ans=0.5 2023-09-29 15:48:51,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:48:53,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:48:57,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:48:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 15:48:59,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 15:48:59,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:02,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:04,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:05,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:07,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 15:49:08,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:10,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:49:10,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 15:49:14,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 15:49:15,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=413153.3333333333, ans=0.125 2023-09-29 15:49:17,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:18,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:18,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:18,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:19,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=413220.0, ans=0.07 2023-09-29 15:49:20,010 INFO [train.py:1039] (1/4) Epoch 12, batch 3550, loss[loss=0.1786, simple_loss=0.2656, pruned_loss=0.04579, over 24316.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2671, pruned_loss=0.0627, over 4719602.73 frames. ], batch size: 74, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:49:21,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:49:29,807 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.79 vs. limit=15.0 2023-09-29 15:49:33,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:33,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:49:39,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:49:40,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:49:42,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:43,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:49:43,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:49:46,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:46,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:49:48,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:48,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:49:50,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:49:55,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:49:55,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:56,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:49:56,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:56,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=413353.3333333333, ans=0.0 2023-09-29 15:49:58,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:49:58,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 15:49:58,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:01,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:03,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:50:10,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:10,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:50:11,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:13,364 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.61 vs. limit=15.0 2023-09-29 15:50:13,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 15:50:14,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:50:15,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 15:50:17,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:50:18,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:50:18,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:50:21,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 15:50:23,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 15:50:29,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=413486.6666666667, ans=0.1 2023-09-29 15:50:30,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:35,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:35,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 15:50:43,740 INFO [train.py:1039] (1/4) Epoch 12, batch 3600, loss[loss=0.2057, simple_loss=0.2656, pruned_loss=0.07292, over 23877.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2664, pruned_loss=0.06261, over 4725288.97 frames. ], batch size: 195, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:50:43,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 15:50:43,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:50:44,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:50:45,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.32 vs. limit=15.0 2023-09-29 15:50:45,975 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.995e+02 2.200e+02 2.637e+02 4.261e+02, threshold=4.399e+02, percent-clipped=1.0 2023-09-29 15:50:47,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:47,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:49,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:50:53,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:50:55,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:57,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:50:57,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:50:58,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:58,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 15:51:02,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:51:02,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:05,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:08,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:11,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:51:12,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:51:12,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 15:51:12,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:15,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:17,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:51:19,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:21,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=413686.6666666667, ans=0.1 2023-09-29 15:51:22,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:22,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:51:24,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 15:51:31,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:51:31,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=413753.3333333333, ans=0.125 2023-09-29 15:51:33,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:51:33,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 15:51:39,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:51:45,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:48,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:49,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.67 vs. limit=22.5 2023-09-29 15:51:50,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=413820.0, ans=0.95 2023-09-29 15:51:56,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:51:56,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:51:56,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 15:51:58,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 15:51:59,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 15:52:02,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:52:03,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:52:04,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 15:52:05,925 INFO [train.py:1039] (1/4) Epoch 12, batch 3650, loss[loss=0.197, simple_loss=0.2645, pruned_loss=0.06473, over 23797.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2671, pruned_loss=0.06253, over 4727174.95 frames. ], batch size: 179, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:52:05,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:06,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:52:06,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:07,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 15:52:07,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 15:52:10,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:52:13,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 15:52:15,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.78 vs. limit=22.5 2023-09-29 15:52:17,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 15:52:19,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:52:22,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 15:52:24,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 15:52:26,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=413953.3333333333, ans=0.1 2023-09-29 15:52:29,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:52:29,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:52:29,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:52:32,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:52:34,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:34,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 15:52:34,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:52:36,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:36,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 15:52:37,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:52:38,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=414020.0, ans=0.0 2023-09-29 15:52:39,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:52:39,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:42,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:52:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 15:52:45,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 15:52:46,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:52:49,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 15:52:50,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:52:50,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:52:57,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:52:59,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:59,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:53:00,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:53:02,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:53:04,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:53:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:09,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:09,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:53:12,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:53:14,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:53:14,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:20,573 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 15:53:23,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:23,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:27,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:53:27,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:28,617 INFO [train.py:1039] (1/4) Epoch 12, batch 3700, loss[loss=0.1977, simple_loss=0.2855, pruned_loss=0.05496, over 24072.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2677, pruned_loss=0.06209, over 4731410.97 frames. ], batch size: 80, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:53:28,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:53:28,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:30,805 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.903e+02 2.176e+02 2.360e+02 3.995e+02, threshold=4.353e+02, percent-clipped=0.0 2023-09-29 15:53:31,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 15:53:31,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:32,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:53:36,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:53:39,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:39,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 15:53:39,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:39,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:53:41,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:53:42,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=22.5 2023-09-29 15:53:45,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:53:48,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:49,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:53:51,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:51,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:53:52,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:53,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=414286.6666666667, ans=0.0 2023-09-29 15:53:54,492 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 15:54:01,350 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:54:04,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:54:05,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:54:07,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:54:07,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 15:54:07,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:11,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:11,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 15:54:14,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:16,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:54:18,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:18,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:54:20,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:54:24,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:26,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 15:54:26,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:54:27,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 15:54:31,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:54:31,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:54:35,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:36,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 15:54:38,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:54:38,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:54:39,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:39,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:44,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 15:54:46,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 15:54:46,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:54:46,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:54:48,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:54:48,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:54:53,608 INFO [train.py:1039] (1/4) Epoch 12, batch 3750, loss[loss=0.2007, simple_loss=0.2752, pruned_loss=0.06311, over 23304.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2692, pruned_loss=0.0635, over 4714568.48 frames. ], batch size: 93, lr: 8.57e-03, grad_scale: 32.0 2023-09-29 15:54:53,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:55,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:54:56,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:54:58,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 15:54:58,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 15:55:01,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:55:03,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 15:55:03,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:55:04,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:04,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:06,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:11,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:14,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:55:16,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:55:21,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:55:22,421 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.39 vs. limit=15.0 2023-09-29 15:55:22,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:24,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 15:55:24,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:26,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:26,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:28,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 15:55:34,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 15:55:34,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:34,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:37,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:38,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=414686.6666666667, ans=0.125 2023-09-29 15:55:39,729 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-09-29 15:55:42,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:44,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:55:48,625 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=12.0 2023-09-29 15:55:49,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 15:55:52,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:56,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:56,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:56:01,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:56:03,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:56:04,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:56:06,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:56:07,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:56:08,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=414820.0, ans=0.125 2023-09-29 15:56:09,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:56:15,910 INFO [train.py:1039] (1/4) Epoch 12, batch 3800, loss[loss=0.1788, simple_loss=0.2395, pruned_loss=0.05903, over 23697.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2682, pruned_loss=0.06309, over 4717277.22 frames. ], batch size: 232, lr: 8.57e-03, grad_scale: 8.0 2023-09-29 15:56:19,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:56:21,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.017e+02 2.225e+02 2.467e+02 3.965e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 15:56:21,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=414886.6666666667, ans=0.2 2023-09-29 15:56:24,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:24,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:56:25,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 15:56:27,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:29,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:31,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:56:32,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=414953.3333333333, ans=0.125 2023-09-29 15:56:33,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:56:33,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:33,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=414953.3333333333, ans=0.2 2023-09-29 15:56:34,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:56:36,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:36,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:56:37,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:39,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 15:56:42,729 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=414953.3333333333, ans=0.1 2023-09-29 15:56:43,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:56:44,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:56:46,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:49,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:56:49,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:56:51,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:56:52,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:54,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:57,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:57:02,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:57:02,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 15:57:03,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:04,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=415086.6666666667, ans=0.125 2023-09-29 15:57:11,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:15,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:57:19,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 15:57:21,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 15:57:21,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:24,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:24,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:26,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 15:57:28,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=415153.3333333333, ans=0.125 2023-09-29 15:57:31,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 15:57:31,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 15:57:31,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:32,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:38,728 INFO [train.py:1039] (1/4) Epoch 12, batch 3850, loss[loss=0.2053, simple_loss=0.2836, pruned_loss=0.06352, over 24050.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2679, pruned_loss=0.06283, over 4723805.99 frames. ], batch size: 80, lr: 8.57e-03, grad_scale: 4.0 2023-09-29 15:57:38,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:57:40,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:57:45,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:57:47,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 15:57:47,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:57:47,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:53,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:57:55,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:57,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:57:57,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 15:58:05,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:06,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=415286.6666666667, ans=0.125 2023-09-29 15:58:07,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:58:10,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:10,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:58:11,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=415353.3333333333, ans=0.125 2023-09-29 15:58:13,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:13,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:58:16,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:16,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:58:16,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:17,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:18,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=415353.3333333333, ans=0.09899494936611666 2023-09-29 15:58:19,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:19,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:58:20,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 15:58:22,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 15:58:22,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:22,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:25,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:27,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:27,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 15:58:30,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 15:58:32,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:34,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 15:58:35,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:58:42,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:43,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:46,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:48,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 15:58:50,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 15:58:53,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:55,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:57,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:58:57,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:58:58,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:00,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:00,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:59:00,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 15:59:01,979 INFO [train.py:1039] (1/4) Epoch 12, batch 3900, loss[loss=0.1938, simple_loss=0.251, pruned_loss=0.06832, over 23558.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2665, pruned_loss=0.06253, over 4698055.03 frames. ], batch size: 232, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 15:59:02,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:59:03,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 15:59:03,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:03,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:05,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:59:05,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:06,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:59:07,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:07,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:59:07,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:07,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 15:59:08,537 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.940e+02 2.154e+02 2.415e+02 3.457e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 15:59:08,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:11,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=415553.3333333333, ans=0.1 2023-09-29 15:59:12,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:15,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:15,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:59:16,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:18,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:18,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:21,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:59:22,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 15:59:22,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:23,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.38 vs. limit=15.0 2023-09-29 15:59:25,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 15:59:25,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:27,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 15:59:30,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 15:59:30,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=415620.0, ans=0.125 2023-09-29 15:59:33,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:35,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:35,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:59:35,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:59:37,699 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:59:42,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:42,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.62 vs. limit=15.0 2023-09-29 15:59:44,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:59:45,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:59:45,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:59:47,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:59:54,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:54,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:00:02,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:00:05,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:00:11,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=415820.0, ans=0.0 2023-09-29 16:00:13,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:17,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:17,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 16:00:18,700 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.70 vs. limit=22.5 2023-09-29 16:00:20,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 16:00:20,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:22,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 16:00:24,269 INFO [train.py:1039] (1/4) Epoch 12, batch 3950, loss[loss=0.2171, simple_loss=0.2844, pruned_loss=0.07488, over 23424.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2664, pruned_loss=0.06225, over 4712006.57 frames. ], batch size: 105, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 16:00:24,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:00:24,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 16:00:32,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:00:33,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 16:00:34,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:00:35,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:00:37,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:00:44,284 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 16:00:45,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:45,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 16:00:45,884 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 16:00:45,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:48,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:48,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:00:48,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:50,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=415953.3333333333, ans=0.125 2023-09-29 16:00:52,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=415953.3333333333, ans=0.025 2023-09-29 16:00:53,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 16:00:55,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:00:56,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:56,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:00:56,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:00:58,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:01:08,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:01:08,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:01:09,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=416020.0, ans=0.0 2023-09-29 16:01:16,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 16:01:20,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.00 vs. limit=10.0 2023-09-29 16:01:22,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 16:01:22,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 16:01:23,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:01:23,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:01:32,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:01:33,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:01:33,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:01:33,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:01:33,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 16:01:40,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:01:42,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:01:46,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 16:01:47,929 INFO [train.py:1039] (1/4) Epoch 12, batch 4000, loss[loss=0.1596, simple_loss=0.238, pruned_loss=0.04061, over 20554.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2666, pruned_loss=0.06196, over 4703061.19 frames. ], batch size: 45, lr: 8.56e-03, grad_scale: 16.0 2023-09-29 16:01:52,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=416220.0, ans=0.1 2023-09-29 16:01:55,125 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.007e+02 2.286e+02 2.878e+02 4.961e+02, threshold=4.572e+02, percent-clipped=2.0 2023-09-29 16:01:55,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:01,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:08,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:02:08,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 16:02:09,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:02:10,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 16:02:10,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:02:10,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 16:02:11,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=416286.6666666667, ans=0.125 2023-09-29 16:02:13,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:14,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:02:14,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:02:14,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:02:16,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:16,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:02:17,328 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=416286.6666666667, ans=0.125 2023-09-29 16:02:18,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:02:21,473 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 16:02:21,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:02:23,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:25,394 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 16:02:27,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:02:27,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:37,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 16:02:37,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:37,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=416420.0, ans=0.1 2023-09-29 16:02:39,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=416420.0, ans=0.07 2023-09-29 16:02:40,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:02:41,892 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 16:02:43,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:02:43,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 16:02:43,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:02:45,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:45,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:02:46,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:02:46,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:02:46,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:49,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 16:02:49,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:51,420 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 16:02:58,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:03:01,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 16:03:04,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:03:04,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:04,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:03:08,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:11,723 INFO [train.py:1039] (1/4) Epoch 12, batch 4050, loss[loss=0.2135, simple_loss=0.2714, pruned_loss=0.07779, over 23535.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.268, pruned_loss=0.06293, over 4708343.40 frames. ], batch size: 134, lr: 8.55e-03, grad_scale: 16.0 2023-09-29 16:03:11,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:14,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:03:14,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 16:03:16,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:03:16,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:18,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:03:19,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:21,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:25,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:25,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=416620.0, ans=0.1 2023-09-29 16:03:27,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:03:29,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 16:03:30,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:03:31,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:03:36,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:38,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:41,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:03:42,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 16:03:44,877 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 16:03:47,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:03:52,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 16:03:53,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:03:56,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:59,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:04:01,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:01,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:04:04,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:04:06,410 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:04:07,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 16:04:07,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:04:09,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:11,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 16:04:16,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:21,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=416820.0, ans=0.0 2023-09-29 16:04:23,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 16:04:24,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:04:24,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:04:26,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 16:04:26,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 16:04:26,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:29,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:04:31,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:31,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:04:31,645 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:04:34,242 INFO [train.py:1039] (1/4) Epoch 12, batch 4100, loss[loss=0.2256, simple_loss=0.2819, pruned_loss=0.0846, over 23560.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2698, pruned_loss=0.06433, over 4697912.88 frames. ], batch size: 285, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:04:37,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 16:04:39,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=416886.6666666667, ans=0.025 2023-09-29 16:04:40,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 16:04:40,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 16:04:41,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=416886.6666666667, ans=0.0 2023-09-29 16:04:42,521 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.020e+02 2.338e+02 2.754e+02 3.996e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 16:04:42,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 16:04:42,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:44,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:44,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:45,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:04:47,226 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 16:04:48,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:50,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:04:51,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:52,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:04:55,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:04:57,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:57,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:04:57,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 16:04:58,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:58,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:04:59,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:04:59,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:59,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 16:05:02,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=416953.3333333333, ans=0.0 2023-09-29 16:05:04,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:05,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 16:05:07,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:05:08,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:05:08,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 16:05:10,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:05:10,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:05:11,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:05:13,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 16:05:15,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:05:16,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:05:18,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 16:05:20,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:05:20,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:23,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:27,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=417086.6666666667, ans=0.125 2023-09-29 16:05:29,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:33,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:34,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:05:36,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=417086.6666666667, ans=0.125 2023-09-29 16:05:42,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:05:42,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:46,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=417153.3333333333, ans=0.2 2023-09-29 16:05:48,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:50,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:05:52,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:53,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:05:55,479 INFO [train.py:1039] (1/4) Epoch 12, batch 4150, loss[loss=0.1956, simple_loss=0.2707, pruned_loss=0.06027, over 23325.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.2709, pruned_loss=0.06493, over 4690581.71 frames. ], batch size: 93, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:05:55,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:05:55,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:05:57,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 16:05:59,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:59,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 16:06:00,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 16:06:01,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 16:06:02,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:06:03,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=417220.0, ans=0.125 2023-09-29 16:06:03,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=417220.0, ans=0.125 2023-09-29 16:06:06,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:06:06,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:11,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:12,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:14,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:06:16,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:06:17,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:06:17,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:06:22,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:26,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=417286.6666666667, ans=0.0 2023-09-29 16:06:27,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:29,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 16:06:29,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 16:06:29,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:06:31,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 16:06:31,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:06:31,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:35,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:36,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:41,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 16:06:46,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:06:46,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:06:47,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 16:06:48,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:50,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 16:06:51,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=417420.0, ans=0.125 2023-09-29 16:06:52,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:06:55,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:55,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:57,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 16:06:57,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:57,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:07:00,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:07:03,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 16:07:03,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:03,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:07:05,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:07:06,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 16:07:06,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:07:06,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 16:07:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:07:10,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:10,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 16:07:10,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:07:16,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:07:18,202 INFO [train.py:1039] (1/4) Epoch 12, batch 4200, loss[loss=0.173, simple_loss=0.252, pruned_loss=0.04701, over 24317.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2697, pruned_loss=0.06415, over 4705444.11 frames. ], batch size: 61, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:07:18,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 16:07:20,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:07:22,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:24,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:07:26,353 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.937e+02 2.271e+02 2.682e+02 4.339e+02, threshold=4.541e+02, percent-clipped=0.0 2023-09-29 16:07:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:26,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:29,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 16:07:29,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=417553.3333333333, ans=0.125 2023-09-29 16:07:32,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 16:07:32,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:35,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:37,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:07:39,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:07:41,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:07:42,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:42,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 16:07:42,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:45,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:45,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:45,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:07:48,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=417620.0, ans=0.07 2023-09-29 16:07:49,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:07:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 16:07:49,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:54,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:07:55,525 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.18 vs. limit=15.0 2023-09-29 16:07:56,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:07:59,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:08:00,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:02,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:08:02,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 16:08:02,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:05,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:08:08,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:08:10,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:10,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=417753.3333333333, ans=0.125 2023-09-29 16:08:17,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:08:18,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 16:08:20,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:27,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:08:27,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:30,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 16:08:34,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:08:38,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=417886.6666666667, ans=0.125 2023-09-29 16:08:39,283 INFO [train.py:1039] (1/4) Epoch 12, batch 4250, loss[loss=0.182, simple_loss=0.2196, pruned_loss=0.07224, over 19065.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2672, pruned_loss=0.06311, over 4691591.05 frames. ], batch size: 388, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:08:39,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:39,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:08:39,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=417886.6666666667, ans=0.125 2023-09-29 16:08:41,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:48,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:08:49,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 16:08:49,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:51,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:57,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:08:58,124 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:09:02,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:02,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:04,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:09:04,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:05,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:05,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:07,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:09,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:09:10,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:12,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 16:09:16,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 16:09:18,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:19,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:09:19,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:21,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:09:21,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:21,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:25,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=418020.0, ans=0.0 2023-09-29 16:09:26,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:09:26,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:09:31,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:33,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:33,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 16:09:34,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:09:34,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 16:09:36,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:09:37,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.28 vs. limit=15.0 2023-09-29 16:09:38,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:09:40,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:40,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:43,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 16:09:44,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:09:44,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=418153.3333333333, ans=22.5 2023-09-29 16:09:45,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:09:50,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:51,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:54,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:09:55,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:56,513 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.77 vs. limit=15.0 2023-09-29 16:09:57,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:09:59,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:09:59,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:09:59,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 16:10:01,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:03,570 INFO [train.py:1039] (1/4) Epoch 12, batch 4300, loss[loss=0.1926, simple_loss=0.2706, pruned_loss=0.05733, over 23958.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2662, pruned_loss=0.06278, over 4688784.83 frames. ], batch size: 86, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:10:08,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:10:08,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:11,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.977e+02 2.264e+02 2.605e+02 3.860e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 16:10:11,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:18,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:10:18,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 16:10:19,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=418286.6666666667, ans=0.1 2023-09-29 16:10:20,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:10:22,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:10:22,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:10:22,215 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 16:10:25,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:10:29,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:10:32,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 16:10:32,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:10:34,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 16:10:36,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:10:38,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:10:42,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:10:42,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:42,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:10:43,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=418353.3333333333, ans=0.0 2023-09-29 16:10:44,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:45,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:10:45,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 16:10:45,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 16:10:48,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:10:49,260 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:10:50,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:10:50,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:50,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 16:10:50,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 16:10:52,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 16:10:53,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:10:53,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 16:10:53,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 16:10:55,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:10:57,062 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 16:10:59,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:11:02,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:02,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:11:04,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 16:11:06,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:11:06,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:06,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:06,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:08,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:11:10,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:11:13,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:13,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:14,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:20,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.91 vs. limit=22.5 2023-09-29 16:11:20,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 16:11:22,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:11:25,247 INFO [train.py:1039] (1/4) Epoch 12, batch 4350, loss[loss=0.1957, simple_loss=0.2721, pruned_loss=0.05964, over 24020.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.268, pruned_loss=0.06315, over 4692673.32 frames. ], batch size: 86, lr: 8.53e-03, grad_scale: 8.0 2023-09-29 16:11:25,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:28,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:30,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:11:30,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:11:34,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:11:34,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=418553.3333333333, ans=0.2 2023-09-29 16:11:39,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:39,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.30 vs. limit=15.0 2023-09-29 16:11:41,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=418620.0, ans=0.0 2023-09-29 16:11:43,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:11:44,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:46,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:11:49,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:11:50,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:11:56,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 16:11:57,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:58,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:58,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=418686.6666666667, ans=0.0 2023-09-29 16:12:04,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:07,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 16:12:09,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=418686.6666666667, ans=0.0 2023-09-29 16:12:11,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:11,657 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:12:12,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:12:18,151 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 16:12:19,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:19,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:12:21,241 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 16:12:21,374 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 16:12:21,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:21,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=418753.3333333333, ans=0.2 2023-09-29 16:12:22,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:12:22,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:12:24,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:24,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=418753.3333333333, ans=0.2 2023-09-29 16:12:25,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:25,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:12:30,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 16:12:30,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:30,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 16:12:31,945 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 16:12:31,953 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 16:12:31,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 16:12:35,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:12:35,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:12:35,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:12:36,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:12:38,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 16:12:41,316 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 16:12:41,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:45,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=418886.6666666667, ans=0.1 2023-09-29 16:12:47,066 INFO [train.py:1039] (1/4) Epoch 12, batch 4400, loss[loss=0.1808, simple_loss=0.2614, pruned_loss=0.05013, over 24452.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.268, pruned_loss=0.06327, over 4703576.97 frames. ], batch size: 69, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:12:47,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:47,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:50,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:50,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 16:12:52,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 16:12:52,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 16:12:52,366 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 16:12:54,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:12:54,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:55,957 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.963e+02 2.169e+02 2.661e+02 4.171e+02, threshold=4.339e+02, percent-clipped=0.0 2023-09-29 16:12:56,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 16:12:59,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:59,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=418886.6666666667, ans=0.0 2023-09-29 16:13:00,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:00,896 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 16:13:05,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:05,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 16:13:05,485 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 16:13:08,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=418953.3333333333, ans=0.0 2023-09-29 16:13:09,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 16:13:10,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 16:13:10,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 16:13:10,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:11,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:13,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:14,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:16,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 16:13:16,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 16:13:16,653 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.85 vs. limit=15.0 2023-09-29 16:13:17,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:19,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:13:19,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:21,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:23,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:23,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 16:13:24,615 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 16:13:25,469 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.06 vs. limit=22.5 2023-09-29 16:13:27,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:35,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:36,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 16:13:38,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=419086.6666666667, ans=0.125 2023-09-29 16:13:41,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:13:43,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:13:47,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:13:47,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 16:13:47,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:13:47,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:13:47,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:13:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:13:49,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=419086.6666666667, ans=0.2 2023-09-29 16:13:53,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=419153.3333333333, ans=0.2 2023-09-29 16:13:54,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 16:13:57,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 16:13:59,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 16:13:59,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:59,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 16:14:00,600 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.71 vs. limit=10.0 2023-09-29 16:14:01,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:14:04,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:14:05,645 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.54 vs. limit=15.0 2023-09-29 16:14:06,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 16:14:09,896 INFO [train.py:1039] (1/4) Epoch 12, batch 4450, loss[loss=0.1833, simple_loss=0.2679, pruned_loss=0.04936, over 24469.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2695, pruned_loss=0.06338, over 4705364.21 frames. ], batch size: 66, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:14:12,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:14:14,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:14,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:14:20,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=419220.0, ans=0.125 2023-09-29 16:14:23,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:24,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:14:26,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:14:32,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:14:33,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:35,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 16:14:35,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:37,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:37,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:14:37,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:14:38,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.55 vs. limit=10.0 2023-09-29 16:14:38,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:14:44,726 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-09-29 16:14:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:45,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:45,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=419353.3333333333, ans=0.125 2023-09-29 16:14:47,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:47,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:49,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:14:53,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:14:55,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 16:14:56,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 16:14:56,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:14:58,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:58,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 16:14:58,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=419420.0, ans=0.1 2023-09-29 16:15:02,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:15:04,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=419420.0, ans=0.1 2023-09-29 16:15:05,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=419420.0, ans=0.0 2023-09-29 16:15:06,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:08,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 16:15:08,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:08,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:10,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:15:10,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:15:10,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:15:15,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 16:15:17,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:15:19,121 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:15:20,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:15:22,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:23,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:24,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:15:25,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:15:28,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 16:15:30,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:15:31,783 INFO [train.py:1039] (1/4) Epoch 12, batch 4500, loss[loss=0.1659, simple_loss=0.2443, pruned_loss=0.04372, over 24458.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2697, pruned_loss=0.06343, over 4700886.60 frames. ], batch size: 58, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:15:35,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:37,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 16:15:37,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 16:15:38,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:15:40,287 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.947e+02 2.224e+02 2.499e+02 3.956e+02, threshold=4.448e+02, percent-clipped=0.0 2023-09-29 16:15:44,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:44,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:45,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:15:47,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:15:47,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:47,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:54,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.93 vs. limit=10.0 2023-09-29 16:15:57,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=419620.0, ans=0.125 2023-09-29 16:16:00,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:16:00,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:16:03,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:04,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:16:05,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:16:13,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:16:17,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=419686.6666666667, ans=0.09899494936611666 2023-09-29 16:16:18,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:16:23,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:16:27,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:16:27,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 16:16:27,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:27,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:33,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:16:34,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 16:16:34,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:16:34,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:39,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:16:39,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:16:43,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:45,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:16:45,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=419820.0, ans=0.125 2023-09-29 16:16:46,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:16:47,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=419820.0, ans=0.125 2023-09-29 16:16:48,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 16:16:50,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 16:16:50,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 16:16:54,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 16:16:56,321 INFO [train.py:1039] (1/4) Epoch 12, batch 4550, loss[loss=0.1721, simple_loss=0.2459, pruned_loss=0.04913, over 24590.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2686, pruned_loss=0.06361, over 4684710.39 frames. ], batch size: 60, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:16:59,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 16:16:59,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:02,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:04,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:07,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:10,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:17:13,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:17:14,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=419953.3333333333, ans=0.125 2023-09-29 16:17:15,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:15,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:17:15,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:17,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=419953.3333333333, ans=0.0 2023-09-29 16:17:18,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:18,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:22,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:17:24,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 16:17:26,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 16:17:26,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:17:28,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 16:17:33,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 16:17:34,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:36,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 16:17:37,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:17:42,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:17:44,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=420086.6666666667, ans=0.125 2023-09-29 16:17:46,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 16:17:46,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:17:49,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:49,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:49,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=420086.6666666667, ans=0.125 2023-09-29 16:17:52,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:53,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 16:17:55,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 16:17:55,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:17:57,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 16:17:57,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 16:17:57,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:59,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:59,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:01,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:01,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:18:03,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=420153.3333333333, ans=0.1 2023-09-29 16:18:04,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:18:04,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 16:18:06,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:18:06,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:18:07,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 16:18:07,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:18:07,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 16:18:10,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:18:10,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:18:13,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:18:14,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:14,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:18:16,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=420153.3333333333, ans=0.125 2023-09-29 16:18:16,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=420153.3333333333, ans=0.0 2023-09-29 16:18:17,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:18:18,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:18:19,368 INFO [train.py:1039] (1/4) Epoch 12, batch 4600, loss[loss=0.1912, simple_loss=0.2609, pruned_loss=0.06078, over 23649.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2674, pruned_loss=0.06311, over 4684643.42 frames. ], batch size: 149, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:18:22,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:23,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:25,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:18:25,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:18:26,889 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.954e+02 2.198e+02 2.471e+02 4.636e+02, threshold=4.396e+02, percent-clipped=1.0 2023-09-29 16:18:27,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:27,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 16:18:28,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:18:36,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:18:36,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:39,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:47,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 16:18:47,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:49,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=11.61 vs. limit=10.0 2023-09-29 16:18:50,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:52,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:18:52,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:57,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 16:18:57,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:18:59,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:18:59,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=420353.3333333333, ans=0.0 2023-09-29 16:19:04,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:04,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:19:07,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:19:11,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 16:19:13,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:19:15,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=420420.0, ans=0.5 2023-09-29 16:19:18,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:19,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:19:22,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:22,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 16:19:22,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:24,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 16:19:24,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:24,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:26,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:26,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:19:27,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:28,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=420486.6666666667, ans=0.125 2023-09-29 16:19:28,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=420486.6666666667, ans=0.2 2023-09-29 16:19:29,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 16:19:30,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 16:19:30,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 16:19:30,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:32,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:32,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:34,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:43,605 INFO [train.py:1039] (1/4) Epoch 12, batch 4650, loss[loss=0.1825, simple_loss=0.2509, pruned_loss=0.05708, over 23413.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2663, pruned_loss=0.06267, over 4680207.13 frames. ], batch size: 134, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:19:45,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:19:49,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:51,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:51,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:19:51,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:52,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:54,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:57,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 16:20:01,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:20:03,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 16:20:03,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:20:03,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 16:20:05,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:20:05,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 16:20:05,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 16:20:05,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:07,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:20:10,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:20:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:11,766 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 16:20:16,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:17,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 16:20:18,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=420686.6666666667, ans=0.125 2023-09-29 16:20:21,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:22,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:20:22,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 16:20:24,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:20:28,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:20:30,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:35,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:39,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:20:43,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=420753.3333333333, ans=0.125 2023-09-29 16:20:44,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 16:20:44,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 16:20:44,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 16:20:44,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 16:20:46,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:20:51,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=420820.0, ans=0.1 2023-09-29 16:20:55,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:20:55,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:20:55,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 16:20:55,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:58,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:20:58,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:21:00,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:21:03,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:21:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:21:03,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:21:04,949 INFO [train.py:1039] (1/4) Epoch 12, batch 4700, loss[loss=0.1949, simple_loss=0.2736, pruned_loss=0.0581, over 24387.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2674, pruned_loss=0.06247, over 4705233.52 frames. ], batch size: 77, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:21:09,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:09,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:21:10,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:21:11,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 16:21:11,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:21:11,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=420886.6666666667, ans=10.0 2023-09-29 16:21:12,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 16:21:14,004 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.872e+02 2.064e+02 2.331e+02 3.087e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-29 16:21:14,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=420886.6666666667, ans=0.125 2023-09-29 16:21:19,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=420953.3333333333, ans=0.125 2023-09-29 16:21:20,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:21,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:21,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=420953.3333333333, ans=0.125 2023-09-29 16:21:22,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:21:22,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:24,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:21:30,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 16:21:30,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 16:21:33,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:33,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:21:34,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:21:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:45,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:21:45,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:21:48,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:55,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 16:21:55,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=421086.6666666667, ans=0.125 2023-09-29 16:21:57,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:22:00,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:04,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 16:22:04,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:22:09,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:22:09,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 16:22:12,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:12,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:14,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:22:15,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:22:15,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 16:22:17,096 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 16:22:18,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:20,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 16:22:20,496 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:22:21,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:25,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 16:22:27,033 INFO [train.py:1039] (1/4) Epoch 12, batch 4750, loss[loss=0.2084, simple_loss=0.2735, pruned_loss=0.07164, over 23858.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2679, pruned_loss=0.0625, over 4719960.41 frames. ], batch size: 195, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:22:28,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:22:30,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:22:36,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 16:22:37,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:22:41,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 16:22:42,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:22:44,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:44,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:22:46,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=421286.6666666667, ans=0.0 2023-09-29 16:22:49,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 16:22:52,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=421286.6666666667, ans=0.0 2023-09-29 16:22:53,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:22:55,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 16:22:55,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:22:59,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=421353.3333333333, ans=0.0 2023-09-29 16:23:02,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:04,416 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 16:23:04,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 16:23:11,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 16:23:14,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:14,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:16,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:23:16,619 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 16:23:16,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:19,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:23:23,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:23:24,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 16:23:25,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 16:23:26,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:27,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:23:27,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:28,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:23:29,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 16:23:33,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 16:23:36,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:23:40,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:23:40,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 16:23:40,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:23:40,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=421486.6666666667, ans=0.0 2023-09-29 16:23:41,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:43,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:23:43,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=421486.6666666667, ans=0.125 2023-09-29 16:23:45,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:45,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:23:50,774 INFO [train.py:1039] (1/4) Epoch 12, batch 4800, loss[loss=0.2065, simple_loss=0.2796, pruned_loss=0.0667, over 23967.00 frames. ], tot_loss[loss=0.1975, simple_loss=0.2691, pruned_loss=0.06291, over 4712178.86 frames. ], batch size: 86, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:23:50,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:50,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 16:23:52,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 16:23:52,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 16:23:55,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:23:55,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:57,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 16:23:59,993 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.054e+02 2.346e+02 2.832e+02 5.942e+02, threshold=4.692e+02, percent-clipped=5.0 2023-09-29 16:24:03,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:04,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:07,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:24:08,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:09,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:09,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 16:24:09,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:24:11,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:24:11,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:24:17,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:17,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=421620.0, ans=0.125 2023-09-29 16:24:18,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:20,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:24:20,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:22,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:24:22,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:23,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:25,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:28,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:30,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:30,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:24:31,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:24:34,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 16:24:36,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 16:24:36,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:24:37,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:24:37,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:37,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:24:39,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:24:39,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:42,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:24:46,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:48,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:24:53,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 16:24:53,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:55,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:55,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:24:55,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.65 vs. limit=15.0 2023-09-29 16:24:56,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:57,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=421820.0, ans=0.125 2023-09-29 16:25:01,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:25:01,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:25:01,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:01,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:25:02,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:25:02,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:25:07,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:08,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:08,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:25:10,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 16:25:11,808 INFO [train.py:1039] (1/4) Epoch 12, batch 4850, loss[loss=0.1728, simple_loss=0.2374, pruned_loss=0.05413, over 23703.00 frames. ], tot_loss[loss=0.1978, simple_loss=0.2692, pruned_loss=0.06324, over 4712104.96 frames. ], batch size: 232, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:25:12,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 16:25:12,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:12,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:13,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:13,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:16,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:25:24,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 16:25:27,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:33,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:33,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:25:34,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:37,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:39,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:25:40,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:25:42,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 16:25:42,865 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.78 vs. limit=15.0 2023-09-29 16:25:45,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:48,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:25:48,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:25:48,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:25:48,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 16:25:50,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:50,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 16:25:55,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 16:25:57,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:26:03,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=422086.6666666667, ans=0.125 2023-09-29 16:26:06,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:26:07,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 16:26:09,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:26:09,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:26:12,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:26:12,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 16:26:12,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:13,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 16:26:13,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:15,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:16,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 16:26:24,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:30,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:26:32,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:26:34,904 INFO [train.py:1039] (1/4) Epoch 12, batch 4900, loss[loss=0.1899, simple_loss=0.2573, pruned_loss=0.06125, over 24343.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2684, pruned_loss=0.06284, over 4712266.79 frames. ], batch size: 56, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:26:38,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 16:26:38,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:26:43,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:43,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:44,648 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.050e+02 2.285e+02 2.620e+02 3.714e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 16:26:44,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:26:48,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 16:26:52,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 16:26:54,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=422286.6666666667, ans=0.125 2023-09-29 16:26:57,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 16:26:58,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 16:26:58,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:26:59,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:27:00,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:00,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:00,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:27:00,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 16:27:05,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 16:27:06,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:27:08,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:27:08,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:27:12,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:27:12,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:13,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:13,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 16:27:15,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:27:16,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 16:27:18,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 16:27:21,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 16:27:22,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:27:24,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:27:24,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:27:24,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=422420.0, ans=0.125 2023-09-29 16:27:25,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:25,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:27:25,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:27:25,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 16:27:27,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=422420.0, ans=0.125 2023-09-29 16:27:29,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:31,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:27:33,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:27:37,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 16:27:38,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:27:39,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:27:39,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 16:27:46,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:48,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:27:49,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 16:27:49,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:27:49,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:27:49,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:54,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:54,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:27:54,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:54,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 16:27:57,494 INFO [train.py:1039] (1/4) Epoch 12, batch 4950, loss[loss=0.1794, simple_loss=0.2188, pruned_loss=0.07002, over 19207.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2662, pruned_loss=0.06242, over 4696240.97 frames. ], batch size: 389, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:27:57,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:27:59,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:00,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:28:03,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 16:28:03,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 16:28:05,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:28:06,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 16:28:06,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:06,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:28:06,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:28:06,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:09,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:11,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:28:12,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:28:12,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:15,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:15,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:28:19,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:28:25,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:27,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:28:30,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:30,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:32,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:28:33,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 16:28:35,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 16:28:38,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:41,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:28:41,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:28:43,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:28:43,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:28:43,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:28:45,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:46,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:28:51,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:28:53,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:53,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:53,768 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.80 vs. limit=15.0 2023-09-29 16:28:54,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 16:28:54,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:28:56,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:28:59,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:00,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:29:00,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:29:00,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:01,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=422753.3333333333, ans=0.2 2023-09-29 16:29:02,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:29:02,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:29:03,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.19 vs. limit=15.0 2023-09-29 16:29:05,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:29:05,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:29:06,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:29:08,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 16:29:09,278 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.63 vs. limit=22.5 2023-09-29 16:29:11,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:16,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 16:29:18,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:29:20,390 INFO [train.py:1039] (1/4) Epoch 12, batch 5000, loss[loss=0.1943, simple_loss=0.2777, pruned_loss=0.05542, over 24684.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2662, pruned_loss=0.06211, over 4705024.49 frames. ], batch size: 68, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:29:26,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:26,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:27,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 16:29:28,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 16:29:31,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:29:31,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=422886.6666666667, ans=0.125 2023-09-29 16:29:32,530 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.922e+02 2.238e+02 2.801e+02 3.922e+02, threshold=4.477e+02, percent-clipped=0.0 2023-09-29 16:29:32,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 16:29:32,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:29:32,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:29:34,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 16:29:35,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:35,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:29:37,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 16:29:37,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:37,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:29:39,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 16:29:40,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 16:29:40,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:29:40,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 16:29:40,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:29:42,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:42,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:29:42,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 16:29:42,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 16:29:43,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 16:29:45,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:45,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:46,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 16:29:47,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:49,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:51,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:54,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:29:56,309 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.74 vs. limit=15.0 2023-09-29 16:29:57,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 16:29:57,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:59,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:30:02,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=423020.0, ans=0.125 2023-09-29 16:30:03,854 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 16:30:06,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:30:08,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:30:08,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:11,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 16:30:11,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:30:12,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:13,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:14,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 16:30:14,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:18,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.92 vs. limit=12.0 2023-09-29 16:30:23,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.07 vs. limit=15.0 2023-09-29 16:30:25,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 16:30:31,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:31,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=423153.3333333333, ans=0.1 2023-09-29 16:30:40,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:41,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:41,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:30:41,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:41,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:30:43,054 INFO [train.py:1039] (1/4) Epoch 12, batch 5050, loss[loss=0.202, simple_loss=0.2763, pruned_loss=0.06386, over 23262.00 frames. ], tot_loss[loss=0.1953, simple_loss=0.2664, pruned_loss=0.06214, over 4708717.41 frames. ], batch size: 93, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:30:43,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:30:43,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 16:30:49,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:30:51,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:52,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:30:54,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 16:30:54,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:55,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:57,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:30:58,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:30:58,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:31:09,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 16:31:09,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:31:11,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:11,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 16:31:12,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:14,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:15,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:31:17,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:31:17,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 16:31:18,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 16:31:20,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:21,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:23,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:24,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=423353.3333333333, ans=15.0 2023-09-29 16:31:25,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 16:31:26,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:29,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 16:31:32,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:31:32,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:31:34,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:35,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:35,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:31:39,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:31:39,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:40,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:31:40,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=423420.0, ans=0.1 2023-09-29 16:31:41,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:31:41,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 16:31:41,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:31:43,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:46,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:46,203 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 16:31:46,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:31:47,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:31:49,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:49,319 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 16:31:52,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:52,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 16:31:52,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:53,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.64 vs. limit=12.0 2023-09-29 16:31:56,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:57,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:58,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 16:31:58,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 16:32:01,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:01,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:01,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:32:03,311 INFO [train.py:1039] (1/4) Epoch 12, batch 5100, loss[loss=0.2016, simple_loss=0.2859, pruned_loss=0.05864, over 24440.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.267, pruned_loss=0.0622, over 4704398.09 frames. ], batch size: 69, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:32:04,934 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 16:32:06,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:32:10,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 16:32:10,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 16:32:12,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:13,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.63 vs. limit=10.0 2023-09-29 16:32:15,479 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.923e+02 2.120e+02 2.504e+02 4.528e+02, threshold=4.241e+02, percent-clipped=1.0 2023-09-29 16:32:15,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:32:18,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:32:20,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 16:32:20,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 16:32:25,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:32:25,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:32:25,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=423620.0, ans=0.125 2023-09-29 16:32:28,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:30,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 16:32:31,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:33,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:32:33,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:32:35,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=423686.6666666667, ans=0.125 2023-09-29 16:32:36,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:37,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:37,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 16:32:39,318 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 16:32:41,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:42,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 16:32:42,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 16:32:46,619 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=15.0 2023-09-29 16:32:47,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:52,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=423753.3333333333, ans=0.2 2023-09-29 16:32:54,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=423753.3333333333, ans=0.0 2023-09-29 16:32:56,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:32:58,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 16:32:58,539 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 16:32:58,568 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 16:33:00,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 16:33:00,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:33:00,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=423753.3333333333, ans=0.0 2023-09-29 16:33:03,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 16:33:07,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 16:33:10,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:33:11,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=423820.0, ans=0.0 2023-09-29 16:33:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:33:16,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 16:33:17,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.57 vs. limit=15.0 2023-09-29 16:33:18,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:33:18,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 16:33:24,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:33:24,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:33:24,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:33:25,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=423886.6666666667, ans=0.125 2023-09-29 16:33:25,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=15.0 2023-09-29 16:33:25,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.59 vs. limit=15.0 2023-09-29 16:33:26,713 INFO [train.py:1039] (1/4) Epoch 12, batch 5150, loss[loss=0.1751, simple_loss=0.2483, pruned_loss=0.05092, over 24442.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2682, pruned_loss=0.06256, over 4705907.77 frames. ], batch size: 58, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:33:26,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:33:26,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:33:28,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:33:29,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 16:33:29,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 16:33:31,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 16:33:31,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:33:31,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 16:33:32,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:32,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:33:34,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:36,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:41,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:33:41,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 16:33:42,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:44,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:33:45,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:33:45,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:33:45,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:33:45,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:33:45,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:33:47,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 16:33:47,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=423953.3333333333, ans=0.1 2023-09-29 16:33:49,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:33:49,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:33:52,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:33:55,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 16:33:55,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:34:03,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:34:05,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 16:34:08,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:15,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:17,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:22,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:22,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:23,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 16:34:28,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:34:29,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:34:30,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:34:34,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:35,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:35,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 16:34:43,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:43,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:34:45,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:45,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:34:46,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:34:46,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:34:46,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:34:48,043 INFO [train.py:1039] (1/4) Epoch 12, batch 5200, loss[loss=0.2111, simple_loss=0.2711, pruned_loss=0.07557, over 23735.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2683, pruned_loss=0.06234, over 4717468.81 frames. ], batch size: 164, lr: 8.48e-03, grad_scale: 16.0 2023-09-29 16:34:48,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:34:49,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=424220.0, ans=0.025 2023-09-29 16:34:51,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:34:53,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=424220.0, ans=0.125 2023-09-29 16:34:54,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:34:55,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:58,799 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.937e+02 2.192e+02 2.501e+02 3.290e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 16:34:59,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 16:35:00,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:35:01,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:05,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:05,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:35:05,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:08,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 16:35:11,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:35:11,656 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:35:12,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:15,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 16:35:18,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:35:18,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:35:19,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=424353.3333333333, ans=0.125 2023-09-29 16:35:20,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 16:35:20,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 16:35:23,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 16:35:23,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:23,655 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 16:35:23,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:25,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:25,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:35:26,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 16:35:26,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:35:29,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:32,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 16:35:32,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 16:35:33,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=424353.3333333333, ans=0.125 2023-09-29 16:35:34,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 16:35:38,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 16:35:40,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:35:45,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:35:45,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:35:48,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 16:35:48,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:48,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 16:35:48,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:50,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:35:54,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:35:56,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:35:57,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:59,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:35:59,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:05,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 16:36:07,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:36:07,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:36:08,358 INFO [train.py:1039] (1/4) Epoch 12, batch 5250, loss[loss=0.2067, simple_loss=0.2766, pruned_loss=0.06845, over 23765.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2675, pruned_loss=0.06273, over 4702349.78 frames. ], batch size: 135, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:36:08,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:10,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:36:11,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:36:15,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:36:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:17,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:36:18,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:36:25,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:26,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:36:26,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:36:28,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:36:31,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 16:36:31,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:32,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:33,433 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.28 vs. limit=15.0 2023-09-29 16:37:12,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=424820.0, ans=0.2 2023-09-29 16:37:15,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=424820.0, ans=0.125 2023-09-29 16:37:18,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=424820.0, ans=0.125 2023-09-29 16:37:22,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=424886.6666666667, ans=0.5 2023-09-29 16:37:23,493 INFO [train.py:1039] (1/4) Epoch 12, batch 5300, loss[loss=0.1987, simple_loss=0.2662, pruned_loss=0.0656, over 23390.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2666, pruned_loss=0.06235, over 4695769.01 frames. ], batch size: 119, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:37:33,071 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.874e+02 2.092e+02 2.441e+02 3.524e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 16:37:38,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:37:38,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 16:37:38,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 16:37:38,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:38,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:38,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:38,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:38,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:38,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:37:39,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:39,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:37:39,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:37:40,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 16:37:40,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 16:37:40,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 16:37:40,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:37:40,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 16:37:40,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 16:37:40,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:41,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:41,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:41,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:41,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:37:42,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:42,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:42,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:42,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:42,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:37:42,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:42,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:37:44,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 16:37:44,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:44,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:44,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 16:37:44,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 16:37:44,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:37:44,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:37:44,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 16:37:45,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 16:37:45,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:45,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:37:46,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:46,238 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 16:37:46,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 16:37:46,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:37:46,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:46,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 16:37:46,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 16:37:46,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 16:37:47,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:57,177 INFO [train.py:1039] (1/4) Epoch 13, batch 0, loss[loss=0.2046, simple_loss=0.2739, pruned_loss=0.06766, over 23820.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2739, pruned_loss=0.06766, over 23820.00 frames. ], batch size: 164, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:37:57,177 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 16:38:10,944 INFO [train.py:1071] (1/4) Epoch 13, validation: loss=0.2695, simple_loss=0.2756, pruned_loss=0.1317, over 1125622.00 frames. 2023-09-29 16:38:10,945 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 16:38:12,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 16:38:13,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=424966.6666666667, ans=0.125 2023-09-29 16:38:13,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=424966.6666666667, ans=0.125 2023-09-29 16:38:14,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:38:15,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:38:17,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=424966.6666666667, ans=0.125 2023-09-29 16:38:20,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:20,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:38:22,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:22,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 16:38:24,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 16:38:27,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:28,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:32,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=425033.3333333333, ans=0.035 2023-09-29 16:38:33,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:33,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:35,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:38:35,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:36,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 16:38:38,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:45,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:38:46,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:48,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 16:38:52,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:38:52,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:38:53,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:38:59,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:39:05,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:06,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=425166.6666666667, ans=0.125 2023-09-29 16:39:10,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=425166.6666666667, ans=0.1 2023-09-29 16:39:11,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 16:39:14,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 16:39:14,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:14,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:15,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:39:16,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:39:18,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 16:39:19,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=425233.3333333333, ans=0.125 2023-09-29 16:39:22,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:24,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:27,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:39:30,565 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 16:39:32,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:39:34,206 INFO [train.py:1039] (1/4) Epoch 13, batch 50, loss[loss=0.1886, simple_loss=0.2728, pruned_loss=0.05222, over 24537.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2735, pruned_loss=0.06428, over 1072520.93 frames. ], batch size: 71, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:39:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:38,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:38,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 16:39:39,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:39:39,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:39:44,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:46,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:48,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:51,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 16:39:51,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:58,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:40:00,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 16:40:02,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 16:40:02,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=425366.6666666667, ans=0.0 2023-09-29 16:40:04,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:40:05,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.44 vs. limit=22.5 2023-09-29 16:40:06,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:06,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:06,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:07,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=425433.3333333333, ans=0.2 2023-09-29 16:40:08,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:40:08,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=425433.3333333333, ans=0.125 2023-09-29 16:40:10,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:40:10,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:17,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:18,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:18,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:40:20,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 16:40:23,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:40:24,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:40:24,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 16:40:26,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:27,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 16:40:35,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:40:35,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:38,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:40,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:40,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:40,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=425566.6666666667, ans=0.07 2023-09-29 16:40:43,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 16:40:43,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 16:40:45,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:46,809 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.884e+02 2.162e+02 2.621e+02 5.674e+02, threshold=4.324e+02, percent-clipped=3.0 2023-09-29 16:40:46,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:48,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:48,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:48,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 16:40:48,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=425566.6666666667, ans=0.2 2023-09-29 16:40:50,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 16:40:50,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:40:52,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:52,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:40:54,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 16:40:54,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 16:40:54,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:55,959 INFO [train.py:1039] (1/4) Epoch 13, batch 100, loss[loss=0.193, simple_loss=0.2606, pruned_loss=0.0627, over 23594.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2709, pruned_loss=0.06283, over 1875448.55 frames. ], batch size: 256, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:40:56,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:57,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:40:57,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:41:01,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:41:01,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=425633.3333333333, ans=0.95 2023-09-29 16:41:04,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:41:06,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=425633.3333333333, ans=0.0 2023-09-29 16:41:07,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:08,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 16:41:08,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:41:13,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:41:13,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:13,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:41:13,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:41:13,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:15,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 16:41:15,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:41:15,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:15,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=425700.0, ans=0.125 2023-09-29 16:41:17,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:17,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:20,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 16:41:23,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:23,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=425700.0, ans=0.0 2023-09-29 16:41:24,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:25,206 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-09-29 16:41:25,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:41:29,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:41:29,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=425766.6666666667, ans=0.2 2023-09-29 16:41:32,690 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 16:41:32,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 16:41:34,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:41:34,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:41:39,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:41:42,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:42,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:48,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:48,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=425833.3333333333, ans=0.0 2023-09-29 16:41:49,748 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 16:41:51,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:41:56,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:41:57,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:42:01,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:03,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=425900.0, ans=0.0 2023-09-29 16:42:04,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:06,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:09,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:42:10,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=425900.0, ans=15.0 2023-09-29 16:42:10,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:12,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:14,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:14,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:42:14,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:16,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 16:42:16,384 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 16:42:16,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:16,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:17,683 INFO [train.py:1039] (1/4) Epoch 13, batch 150, loss[loss=0.226, simple_loss=0.2854, pruned_loss=0.08331, over 23940.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2698, pruned_loss=0.06324, over 2502930.29 frames. ], batch size: 195, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:42:17,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:19,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:42:19,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:19,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:19,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:42:19,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:42:19,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:42:19,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:21,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:22,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:24,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:42:24,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:42:27,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:29,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:29,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:42:31,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:34,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:34,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:37,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:42:38,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:42,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 16:42:42,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 16:42:42,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 16:42:46,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:42:46,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:42:48,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:42:48,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:49,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:49,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:51,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:52,854 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 16:42:54,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:59,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:02,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:43:04,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 16:43:07,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=426166.6666666667, ans=0.1 2023-09-29 16:43:08,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=426166.6666666667, ans=0.125 2023-09-29 16:43:09,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:43:09,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:09,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:11,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:43:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:43:14,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:43:15,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:15,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 16:43:17,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=426166.6666666667, ans=0.125 2023-09-29 16:43:23,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:24,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:24,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:43:24,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:43:27,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:27,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 16:43:30,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:43:32,023 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.958e+02 2.151e+02 2.617e+02 4.145e+02, threshold=4.302e+02, percent-clipped=0.0 2023-09-29 16:43:32,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:43:33,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:35,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:43:35,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 16:43:35,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:37,445 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 16:43:40,253 INFO [train.py:1039] (1/4) Epoch 13, batch 200, loss[loss=0.1891, simple_loss=0.2705, pruned_loss=0.05387, over 24531.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.2718, pruned_loss=0.06452, over 3001265.90 frames. ], batch size: 71, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:43:42,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:43:45,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:43:45,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:43:49,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 16:43:51,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:51,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:53,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 16:43:55,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:43:56,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:58,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:59,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=426366.6666666667, ans=0.125 2023-09-29 16:44:03,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:44:03,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:44:05,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:19,624 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.25 vs. limit=15.0 2023-09-29 16:44:22,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=426433.3333333333, ans=0.1 2023-09-29 16:44:23,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:44:24,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:44:24,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:44:25,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=426433.3333333333, ans=0.1 2023-09-29 16:44:26,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:44:26,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 16:44:26,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:44:28,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:28,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:44:28,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:28,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:44:32,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 16:44:32,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:44:32,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:39,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:44:44,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:51,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:51,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:44:58,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:59,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 16:45:01,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:01,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:45:01,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:03,246 INFO [train.py:1039] (1/4) Epoch 13, batch 250, loss[loss=0.2147, simple_loss=0.2679, pruned_loss=0.08079, over 23777.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2704, pruned_loss=0.06407, over 3373260.30 frames. ], batch size: 212, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:45:03,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:45:03,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 16:45:05,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:45:05,702 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 16:45:08,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:10,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:45:10,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:15,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:17,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:45:17,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:18,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:45:23,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:45:30,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=426700.0, ans=0.125 2023-09-29 16:45:32,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=426700.0, ans=0.0 2023-09-29 16:45:35,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:36,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:36,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:45:45,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:45:46,024 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=15.0 2023-09-29 16:45:46,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:45:48,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:45:48,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:48,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:45:48,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:45:50,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:52,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.00 vs. limit=15.0 2023-09-29 16:45:53,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:45:56,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 16:45:56,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:57,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:45:57,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:45:57,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:46:00,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:01,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:46:01,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:46:04,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:06,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:46:06,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:09,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:46:10,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=426833.3333333333, ans=0.125 2023-09-29 16:46:13,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:15,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:46:17,763 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.31 vs. limit=15.0 2023-09-29 16:46:20,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:21,731 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.073e+02 2.477e+02 4.320e+02, threshold=4.145e+02, percent-clipped=1.0 2023-09-29 16:46:23,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:46:26,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 16:46:28,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:46:29,467 INFO [train.py:1039] (1/4) Epoch 13, batch 300, loss[loss=0.1945, simple_loss=0.2675, pruned_loss=0.06072, over 24317.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2672, pruned_loss=0.0631, over 3666565.92 frames. ], batch size: 61, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:46:29,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:31,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 16:46:32,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:46:32,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:46:32,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 16:46:39,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:40,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:46:40,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=426966.6666666667, ans=0.5 2023-09-29 16:46:43,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:46:43,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 16:46:45,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:45,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:46:45,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 16:46:45,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:46:49,373 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.70 vs. limit=15.0 2023-09-29 16:46:49,381 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.30 vs. limit=12.0 2023-09-29 16:46:50,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:46:52,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=427033.3333333333, ans=0.125 2023-09-29 16:46:55,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:46:56,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 16:47:01,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 16:47:01,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:02,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:05,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:05,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 16:47:05,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:47:09,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:47:12,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:47:12,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:17,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:47:18,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 16:47:18,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:47:22,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:23,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 16:47:25,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:28,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:47:33,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:47:33,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 16:47:36,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:36,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:47:40,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:42,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:47:42,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 16:47:42,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:47:43,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:45,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 16:47:47,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:47,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:48,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:49,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:50,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:51,849 INFO [train.py:1039] (1/4) Epoch 13, batch 350, loss[loss=0.2029, simple_loss=0.2667, pruned_loss=0.06959, over 23685.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2662, pruned_loss=0.06209, over 3901471.68 frames. ], batch size: 164, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:47:55,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:47:55,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:47:55,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=427300.0, ans=0.125 2023-09-29 16:47:58,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:58,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=427300.0, ans=0.125 2023-09-29 16:48:05,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:48:07,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:07,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:10,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 16:48:12,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:12,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 16:48:15,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:15,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 16:48:15,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:18,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 16:48:20,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:48:22,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:23,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:48:24,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=427433.3333333333, ans=0.125 2023-09-29 16:48:25,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:48:26,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:26,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:48:28,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:48:28,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:29,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.43 vs. limit=15.0 2023-09-29 16:48:30,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=427433.3333333333, ans=0.125 2023-09-29 16:48:36,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:48:36,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:48:37,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:48:37,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:44,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 16:48:45,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:49,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:49,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:48:50,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:50,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 16:48:54,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:48:54,746 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 16:48:57,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 16:48:57,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:00,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:49:00,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 16:49:02,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:04,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:49:05,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:07,238 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.930e+02 2.101e+02 2.393e+02 3.670e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-29 16:49:07,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:07,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:09,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=427566.6666666667, ans=0.05 2023-09-29 16:49:10,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:13,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:49:14,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=427633.3333333333, ans=0.0 2023-09-29 16:49:15,487 INFO [train.py:1039] (1/4) Epoch 13, batch 400, loss[loss=0.2115, simple_loss=0.2686, pruned_loss=0.07721, over 23788.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2649, pruned_loss=0.06197, over 4078512.29 frames. ], batch size: 179, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:49:15,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:49:17,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 16:49:17,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:17,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:19,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:49:20,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:24,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:28,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 16:49:30,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 16:49:30,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:32,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 16:49:32,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:49:38,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:38,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 16:49:38,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:49:38,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:40,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:43,541 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 16:49:43,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 16:49:48,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:50,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:50,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:50,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:52,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 16:49:52,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 16:49:53,127 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.04 vs. limit=15.0 2023-09-29 16:49:54,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:55,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:49:55,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=427766.6666666667, ans=0.0 2023-09-29 16:49:59,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:50:00,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:02,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=427766.6666666667, ans=0.0 2023-09-29 16:50:07,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 16:50:08,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:50:11,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 16:50:13,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:50:15,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:50:15,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 16:50:18,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:50:21,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:50:23,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:50:25,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:27,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 16:50:28,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:50:30,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 16:50:30,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=427900.0, ans=0.2 2023-09-29 16:50:31,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:50:31,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:50:34,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 16:50:34,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=427900.0, ans=0.125 2023-09-29 16:50:36,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:50:37,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:50:38,457 INFO [train.py:1039] (1/4) Epoch 13, batch 450, loss[loss=0.1997, simple_loss=0.2835, pruned_loss=0.05793, over 24442.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2656, pruned_loss=0.06181, over 4219424.72 frames. ], batch size: 69, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:50:38,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:50:40,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 16:50:40,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:50:40,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:50:40,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=427966.6666666667, ans=0.125 2023-09-29 16:50:41,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:50:41,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=427966.6666666667, ans=10.0 2023-09-29 16:50:43,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 16:50:43,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:50:43,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:50:46,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:50:58,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:58,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:00,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 16:51:01,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 16:51:03,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:51:08,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:09,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:13,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:14,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:17,641 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.67 vs. limit=22.5 2023-09-29 16:51:18,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 16:51:18,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 16:51:21,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 16:51:21,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:51:22,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:22,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:51:25,119 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 16:51:25,142 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 16:51:26,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:26,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:51:26,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=428166.6666666667, ans=0.09899494936611666 2023-09-29 16:51:28,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:51:31,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:51:31,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:51:31,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 16:51:33,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 16:51:36,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:39,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:51:40,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:51:41,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 16:51:44,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=428233.3333333333, ans=0.125 2023-09-29 16:51:45,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:51:46,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 16:51:48,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 16:51:49,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:52,578 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.942e+02 2.289e+02 2.754e+02 3.873e+02, threshold=4.578e+02, percent-clipped=0.0 2023-09-29 16:51:55,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:51:58,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:00,615 INFO [train.py:1039] (1/4) Epoch 13, batch 500, loss[loss=0.2355, simple_loss=0.2879, pruned_loss=0.09151, over 22911.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2668, pruned_loss=0.06169, over 4347502.60 frames. ], batch size: 322, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:52:00,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:52:00,779 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 16:52:03,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:05,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:52:05,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:05,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 16:52:07,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 16:52:07,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:10,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:52:16,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:52:17,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:52:21,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:21,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:21,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:33,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:33,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:52:34,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:52:34,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:34,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 16:52:34,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:52:38,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:52:39,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:52:39,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:52:39,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:40,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 16:52:43,563 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 16:52:49,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:52:50,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=428500.0, ans=0.0 2023-09-29 16:52:53,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:52:54,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 16:52:58,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:53:00,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:03,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:06,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:53:06,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=428566.6666666667, ans=0.0 2023-09-29 16:53:12,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:14,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 16:53:14,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:14,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:17,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 16:53:19,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:53:20,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:23,496 INFO [train.py:1039] (1/4) Epoch 13, batch 550, loss[loss=0.2185, simple_loss=0.2761, pruned_loss=0.0804, over 23804.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.268, pruned_loss=0.06242, over 4432690.12 frames. ], batch size: 164, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:53:27,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 16:53:27,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=428633.3333333333, ans=0.1 2023-09-29 16:53:28,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 16:53:30,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:30,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 16:53:31,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:53:31,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:31,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:53:33,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:53:35,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=428633.3333333333, ans=0.0 2023-09-29 16:53:36,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:37,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 16:53:39,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:53:39,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=428700.0, ans=0.125 2023-09-29 16:53:43,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:53:43,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:47,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:53:47,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:51,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=428700.0, ans=0.0 2023-09-29 16:53:52,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 16:53:54,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 16:53:54,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:54:03,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:54:03,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:03,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:54:07,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:07,943 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 16:54:08,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:54:09,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 16:54:12,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:12,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:54:12,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:54:14,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:15,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 16:54:15,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 16:54:17,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:19,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:54:19,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:54:19,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:54:20,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:54:20,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=428833.3333333333, ans=0.0 2023-09-29 16:54:22,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:54:25,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:54:25,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:25,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:54:27,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:54:28,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.90 vs. limit=15.0 2023-09-29 16:54:29,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:31,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:54:32,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:34,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:54:34,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:54:36,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=428900.0, ans=0.0 2023-09-29 16:54:39,420 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.968e+02 2.209e+02 2.597e+02 3.344e+02, threshold=4.418e+02, percent-clipped=0.0 2023-09-29 16:54:39,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 16:54:44,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 16:54:44,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:54:44,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=428966.6666666667, ans=0.02 2023-09-29 16:54:44,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=428966.6666666667, ans=0.1 2023-09-29 16:54:45,779 INFO [train.py:1039] (1/4) Epoch 13, batch 600, loss[loss=0.2021, simple_loss=0.2636, pruned_loss=0.0703, over 23691.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2686, pruned_loss=0.06286, over 4495365.50 frames. ], batch size: 149, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:54:45,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:54:45,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:52,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:54:52,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:54:54,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 16:54:55,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:54:59,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:02,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:05,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 16:55:05,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:55:09,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.50 vs. limit=15.0 2023-09-29 16:55:10,791 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.83 vs. limit=22.5 2023-09-29 16:55:13,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=429033.3333333333, ans=0.02 2023-09-29 16:55:14,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 16:55:18,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:55:18,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:18,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:55:25,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:55:25,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:55:27,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:33,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:55:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:39,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:39,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:43,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=429166.6666666667, ans=0.5 2023-09-29 16:55:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 16:55:54,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:55:54,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:55:55,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=429233.3333333333, ans=0.125 2023-09-29 16:55:56,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=429233.3333333333, ans=0.125 2023-09-29 16:55:58,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 16:55:58,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:56:00,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=429233.3333333333, ans=0.2 2023-09-29 16:56:01,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 16:56:02,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=429233.3333333333, ans=10.0 2023-09-29 16:56:03,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:56:03,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:56:08,412 INFO [train.py:1039] (1/4) Epoch 13, batch 650, loss[loss=0.2183, simple_loss=0.2789, pruned_loss=0.07882, over 23858.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.2678, pruned_loss=0.06256, over 4540864.88 frames. ], batch size: 164, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:56:08,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:56:11,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:56:13,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:56:15,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:56:15,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=429300.0, ans=0.0 2023-09-29 16:56:17,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:19,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=429300.0, ans=0.2 2023-09-29 16:56:20,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 16:56:20,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:56:28,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:56:28,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:30,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:34,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 16:56:34,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:56:36,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:40,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:56:40,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 16:56:42,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.89 vs. limit=15.0 2023-09-29 16:56:45,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:45,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:45,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:56:47,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:48,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:56:50,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:56:50,037 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 16:56:50,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:50,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:56:52,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=429433.3333333333, ans=0.0 2023-09-29 16:56:55,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:56,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:56:56,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:56:58,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:56:58,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 16:56:59,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:56:59,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:57:01,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:57:01,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:57:02,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:57:03,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 16:57:04,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 16:57:04,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:04,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:57:04,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:57:06,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:57:07,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:57:14,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:14,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:57:18,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:57:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:21,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 16:57:21,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:22,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=429566.6666666667, ans=0.125 2023-09-29 16:57:24,835 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.054e+02 2.276e+02 2.735e+02 4.255e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 16:57:26,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=429566.6666666667, ans=0.0 2023-09-29 16:57:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:57:28,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:28,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:29,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:31,326 INFO [train.py:1039] (1/4) Epoch 13, batch 700, loss[loss=0.2011, simple_loss=0.2861, pruned_loss=0.05802, over 24076.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2662, pruned_loss=0.06214, over 4579750.12 frames. ], batch size: 80, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:57:34,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 16:57:35,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 16:57:39,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 16:57:40,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:42,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:57:45,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 16:57:49,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:49,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=429700.0, ans=0.2 2023-09-29 16:57:50,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:57:51,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=429700.0, ans=0.1 2023-09-29 16:57:52,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:56,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:57:56,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:58:01,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:58:01,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=429700.0, ans=0.125 2023-09-29 16:58:04,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:58:04,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:58:05,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 16:58:07,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 16:58:12,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:58:12,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:58:13,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:58:14,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.62 vs. limit=15.0 2023-09-29 16:58:18,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:58:18,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 16:58:24,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:24,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:58:24,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 16:58:29,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:58:29,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:32,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:58:37,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=429900.0, ans=0.125 2023-09-29 16:58:38,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:58:40,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 16:58:43,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 16:58:43,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 16:58:45,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:45,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=429900.0, ans=0.125 2023-09-29 16:58:50,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:58:50,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:58:53,026 INFO [train.py:1039] (1/4) Epoch 13, batch 750, loss[loss=0.1734, simple_loss=0.249, pruned_loss=0.04887, over 24626.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2652, pruned_loss=0.0616, over 4603036.90 frames. ], batch size: 60, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 16:58:53,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:53,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 16:58:57,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 16:58:57,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 16:58:57,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 16:58:57,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 16:58:57,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 16:58:59,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:59:00,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 16:59:02,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:04,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:06,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=429966.6666666667, ans=10.0 2023-09-29 16:59:07,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:08,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:08,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:59:08,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:11,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:59:13,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:59:15,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:59:16,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:16,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:16,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 16:59:18,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:59:19,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:21,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:24,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:59:26,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 16:59:26,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:59:27,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 16:59:27,980 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 16:59:28,696 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.78 vs. limit=15.0 2023-09-29 16:59:29,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 16:59:29,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:59:29,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:59:33,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:59:35,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=430100.0, ans=0.0 2023-09-29 16:59:43,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:43,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:43,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:59:44,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:45,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:46,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 16:59:46,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:59:48,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:59:49,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:59:51,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:59:52,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 16:59:54,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:58,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:59,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:59:59,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:03,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:00:08,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 17:00:08,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:09,478 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.868e+02 2.099e+02 2.383e+02 3.939e+02, threshold=4.199e+02, percent-clipped=0.0 2023-09-29 17:00:09,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:09,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=430233.3333333333, ans=0.015 2023-09-29 17:00:11,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:13,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:15,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:15,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:00:16,497 INFO [train.py:1039] (1/4) Epoch 13, batch 800, loss[loss=0.2006, simple_loss=0.2802, pruned_loss=0.06051, over 24473.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2667, pruned_loss=0.06176, over 4632959.09 frames. ], batch size: 69, lr: 8.09e-03, grad_scale: 32.0 2023-09-29 17:00:23,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.09 vs. limit=15.0 2023-09-29 17:00:24,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:24,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:25,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:25,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:26,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:27,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:30,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:31,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=430366.6666666667, ans=0.2 2023-09-29 17:00:35,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:35,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=430366.6666666667, ans=0.0 2023-09-29 17:00:36,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:00:39,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 17:00:41,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:41,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:43,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:00:43,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:43,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 17:00:43,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:45,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 17:00:47,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:50,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:50,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:52,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:53,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:53,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:57,630 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.76 vs. limit=15.0 2023-09-29 17:00:59,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:00:59,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:00:59,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 17:01:01,560 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 17:01:01,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 17:01:01,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:01:01,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:03,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:03,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:08,544 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 17:01:09,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.21 vs. limit=6.0 2023-09-29 17:01:09,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 17:01:10,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:01:10,768 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.24 vs. limit=15.0 2023-09-29 17:01:11,957 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:01:13,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:01:17,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:01:21,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:22,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 17:01:23,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:01:25,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 17:01:30,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:30,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=430566.6666666667, ans=0.125 2023-09-29 17:01:33,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:01:33,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 17:01:33,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:01:34,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:36,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 17:01:36,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:38,233 INFO [train.py:1039] (1/4) Epoch 13, batch 850, loss[loss=0.2025, simple_loss=0.2665, pruned_loss=0.06922, over 23791.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2674, pruned_loss=0.06186, over 4654472.63 frames. ], batch size: 212, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 17:01:38,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:01:38,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=430633.3333333333, ans=0.125 2023-09-29 17:01:39,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:41,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:01:42,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:45,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 17:01:45,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 17:01:45,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 17:01:47,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:47,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:01:51,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:52,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:52,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:01:54,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=430700.0, ans=0.125 2023-09-29 17:01:56,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:56,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:57,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 17:02:02,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 17:02:03,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:02:05,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 17:02:08,001 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.19 vs. limit=15.0 2023-09-29 17:02:08,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 17:02:10,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 17:02:13,581 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 17:02:13,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:13,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:02:13,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:02:17,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:18,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:20,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 17:02:21,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:21,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:24,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:02:24,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:02:27,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:02:27,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:02:27,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 17:02:28,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=430833.3333333333, ans=0.125 2023-09-29 17:02:31,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:02:31,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:32,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:02:32,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:34,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:37,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:38,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:02:40,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:02:41,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:02:41,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:02:47,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=430900.0, ans=0.125 2023-09-29 17:02:51,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:02:51,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:53,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 17:02:53,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:02:53,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:57,228 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.022e+02 2.303e+02 2.741e+02 5.777e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 17:02:57,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 17:03:01,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=430966.6666666667, ans=0.0 2023-09-29 17:03:02,572 INFO [train.py:1039] (1/4) Epoch 13, batch 900, loss[loss=0.2371, simple_loss=0.2985, pruned_loss=0.08784, over 23468.00 frames. ], tot_loss[loss=0.1975, simple_loss=0.2691, pruned_loss=0.06299, over 4658817.10 frames. ], batch size: 285, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:03:05,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:03:07,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:07,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 17:03:10,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:03:10,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 17:03:11,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:03:13,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:03:13,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:13,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=430966.6666666667, ans=0.1 2023-09-29 17:03:14,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:03:14,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:03:26,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:03:26,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:26,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:03:29,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:34,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 17:03:34,713 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=431100.0, ans=0.125 2023-09-29 17:03:37,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.52 vs. limit=15.0 2023-09-29 17:03:37,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:03:42,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:03:42,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:03:43,996 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 17:03:45,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 17:03:53,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:03:53,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:03:53,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:03:53,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=431166.6666666667, ans=0.1 2023-09-29 17:03:59,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:01,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:03,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 17:04:03,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:04:08,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 17:04:10,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:04:11,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:12,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:04:13,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:18,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 17:04:18,239 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 17:04:19,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:04:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 17:04:21,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:22,897 INFO [train.py:1039] (1/4) Epoch 13, batch 950, loss[loss=0.1914, simple_loss=0.2688, pruned_loss=0.05702, over 23717.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.269, pruned_loss=0.0627, over 4675611.08 frames. ], batch size: 85, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:04:24,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 17:04:29,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:31,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:04:38,158 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 17:04:40,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:41,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:04:43,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:43,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:04:43,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 17:04:45,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:04:47,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:47,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 17:04:48,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:52,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=431366.6666666667, ans=0.0 2023-09-29 17:04:53,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:53,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:53,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:54,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 17:04:56,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:04:58,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:05:01,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:05:07,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:07,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:05:11,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 17:05:14,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:05:14,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:05:15,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:16,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:16,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:05:17,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=431500.0, ans=0.0 2023-09-29 17:05:19,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 17:05:19,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:05:20,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=431500.0, ans=15.0 2023-09-29 17:05:24,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:24,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:26,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 17:05:26,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:26,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:05:26,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 17:05:31,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:05:31,565 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:05:34,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:40,315 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.121e+02 2.375e+02 2.805e+02 4.363e+02, threshold=4.749e+02, percent-clipped=0.0 2023-09-29 17:05:40,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:05:41,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 17:05:42,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 17:05:46,077 INFO [train.py:1039] (1/4) Epoch 13, batch 1000, loss[loss=0.1904, simple_loss=0.2653, pruned_loss=0.05778, over 24654.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2678, pruned_loss=0.06182, over 4698545.38 frames. ], batch size: 65, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:05:46,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:48,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 17:05:48,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:54,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:05:56,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 17:05:56,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 17:06:01,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:01,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:06:03,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:06,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 17:06:10,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 17:06:11,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 17:06:12,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:14,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 17:06:15,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 17:06:15,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=431700.0, ans=0.2 2023-09-29 17:06:17,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 17:06:17,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:19,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:26,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=431766.6666666667, ans=0.0 2023-09-29 17:06:27,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:28,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:06:28,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:29,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=431766.6666666667, ans=0.125 2023-09-29 17:06:30,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:30,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 17:06:31,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:31,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:06:33,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:33,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 17:06:38,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 17:06:39,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 17:06:40,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=431833.3333333333, ans=0.125 2023-09-29 17:06:41,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 17:06:44,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:06:44,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=431833.3333333333, ans=0.04949747468305833 2023-09-29 17:06:47,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=431833.3333333333, ans=0.125 2023-09-29 17:06:51,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:51,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:06:51,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:54,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:06:56,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 17:06:58,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:06:58,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 17:07:00,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 17:07:00,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:00,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:07:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:07:04,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:07:07,638 INFO [train.py:1039] (1/4) Epoch 13, batch 1050, loss[loss=0.1995, simple_loss=0.2692, pruned_loss=0.0649, over 23408.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2661, pruned_loss=0.06169, over 4690908.70 frames. ], batch size: 93, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:07:07,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:11,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:07:11,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:07:12,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=431966.6666666667, ans=0.125 2023-09-29 17:07:14,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:07:16,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:18,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:19,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:07:21,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:07:24,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:07:24,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:07:26,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:07:26,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:07:26,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=432033.3333333333, ans=0.125 2023-09-29 17:07:27,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 17:07:28,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:28,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 17:07:33,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:33,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 17:07:33,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:07:35,698 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.74 vs. limit=6.0 2023-09-29 17:07:39,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:41,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:07:41,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:42,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 17:07:44,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 17:07:44,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:49,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 17:07:50,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 17:07:52,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:07:57,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:07:57,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:07:57,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:08:02,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:08:06,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 17:08:09,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 17:08:09,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 17:08:09,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:09,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:08:10,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 17:08:15,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:08:18,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:18,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:18,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:18,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:24,671 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.212e+02 2.486e+02 3.871e+02, threshold=4.425e+02, percent-clipped=0.0 2023-09-29 17:08:24,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:24,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 17:08:26,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:26,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 17:08:26,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 17:08:27,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:08:29,439 INFO [train.py:1039] (1/4) Epoch 13, batch 1100, loss[loss=0.1806, simple_loss=0.2471, pruned_loss=0.05711, over 24477.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2652, pruned_loss=0.06141, over 4687845.84 frames. ], batch size: 58, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:08:29,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:08:36,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:08:41,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:08:43,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:08:43,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:43,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 17:08:43,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=432300.0, ans=0.0 2023-09-29 17:08:45,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:08:48,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:08:49,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:08:53,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:08:53,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 17:08:54,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:08:54,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:54,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:58,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:08:59,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:09:05,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:09:08,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 17:09:10,235 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 17:09:10,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:13,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:15,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:09:15,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:09:17,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 17:09:17,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:09:18,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:09:18,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:09:18,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:18,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 17:09:24,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:09:26,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 17:09:28,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:09:31,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:09:36,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 17:09:36,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:09:38,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:39,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:09:41,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:41,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 17:09:42,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:09:42,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:43,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 17:09:43,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:09:45,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 17:09:48,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:09:48,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:09:50,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:09:50,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=432566.6666666667, ans=0.125 2023-09-29 17:09:53,102 INFO [train.py:1039] (1/4) Epoch 13, batch 1150, loss[loss=0.2059, simple_loss=0.2784, pruned_loss=0.06669, over 23301.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2651, pruned_loss=0.06104, over 4701745.35 frames. ], batch size: 105, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:09:54,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:09:57,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:10:01,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:01,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:10:01,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 17:10:02,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:04,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 17:10:07,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:07,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:10:12,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 17:10:15,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:20,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:21,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:21,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 17:10:21,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:10:23,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:26,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=432766.6666666667, ans=0.5 2023-09-29 17:10:27,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 17:10:28,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:30,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:38,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:45,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:46,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 17:10:46,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:46,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:53,210 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 17:10:54,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:03,402 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 17:11:03,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=432900.0, ans=0.1 2023-09-29 17:11:06,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:08,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:11:09,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.852e+02 2.092e+02 2.448e+02 3.672e+02, threshold=4.183e+02, percent-clipped=0.0 2023-09-29 17:11:09,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:11:09,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:11:13,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:15,568 INFO [train.py:1039] (1/4) Epoch 13, batch 1200, loss[loss=0.1916, simple_loss=0.2668, pruned_loss=0.05817, over 24293.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2662, pruned_loss=0.06119, over 4709260.73 frames. ], batch size: 61, lr: 8.07e-03, grad_scale: 32.0 2023-09-29 17:11:17,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:11:17,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:11:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:18,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:20,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:11:21,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:11:23,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:11:24,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:25,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:28,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=432966.6666666667, ans=0.0 2023-09-29 17:11:29,234 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 17:11:31,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 17:11:32,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.91 vs. limit=15.0 2023-09-29 17:11:36,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:11:39,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:11:41,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:43,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:11:43,162 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 17:11:44,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:44,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=433033.3333333333, ans=0.125 2023-09-29 17:11:48,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=433100.0, ans=0.07 2023-09-29 17:11:51,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:11:51,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:11:53,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 17:11:54,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:11:59,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 17:12:02,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=433166.6666666667, ans=0.125 2023-09-29 17:12:03,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 17:12:04,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:12:05,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:12:07,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:09,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:12:11,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:12:11,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:12:13,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:12:13,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 17:12:14,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:12:14,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=433166.6666666667, ans=0.07 2023-09-29 17:12:15,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:15,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:12:18,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:12:18,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:22,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:12:24,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:12:27,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 17:12:32,423 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 17:12:34,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:12:36,902 INFO [train.py:1039] (1/4) Epoch 13, batch 1250, loss[loss=0.2025, simple_loss=0.2602, pruned_loss=0.07242, over 23752.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2669, pruned_loss=0.06099, over 4720425.49 frames. ], batch size: 179, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:12:37,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:38,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:12:40,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:12:41,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 17:12:42,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-09-29 17:12:43,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=433300.0, ans=0.125 2023-09-29 17:12:47,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:12:47,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:12:49,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 17:12:50,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:12:52,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:12:59,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:12:59,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:01,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:13:01,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:02,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:13:04,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:13:04,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:13:04,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:06,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:06,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:09,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:10,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:13:16,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 17:13:17,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:13:21,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:21,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 17:13:22,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:22,825 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 17:13:24,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:24,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:29,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:13:37,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 17:13:37,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 17:13:37,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 17:13:39,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:13:39,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 17:13:39,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:42,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:13:44,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:13:45,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 17:13:45,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:13:47,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:13:47,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:13:48,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:50,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 17:13:52,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:54,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:13:55,655 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.895e+02 2.072e+02 2.279e+02 3.563e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-29 17:13:55,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:13:57,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:14:00,451 INFO [train.py:1039] (1/4) Epoch 13, batch 1300, loss[loss=0.1706, simple_loss=0.2434, pruned_loss=0.04894, over 24446.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2682, pruned_loss=0.06194, over 4714229.68 frames. ], batch size: 58, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:14:02,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:14:02,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 17:14:07,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:10,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:14:11,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:13,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:14:13,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:14:15,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 17:14:16,245 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.60 vs. limit=10.0 2023-09-29 17:14:19,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:14:21,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-09-29 17:14:21,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:14:23,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 17:14:27,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:14:31,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:32,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=433766.6666666667, ans=0.2 2023-09-29 17:14:33,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:33,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:36,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:36,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:14:38,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:14:38,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 17:14:46,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:14:46,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:14:46,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 17:14:47,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=433766.6666666667, ans=0.125 2023-09-29 17:14:48,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:14:48,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:14:51,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:51,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 17:14:52,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:53,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 17:14:55,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:59,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:59,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:15:03,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 17:15:04,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 17:15:06,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 17:15:10,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:15:14,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 17:15:15,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:22,690 INFO [train.py:1039] (1/4) Epoch 13, batch 1350, loss[loss=0.2017, simple_loss=0.271, pruned_loss=0.0662, over 23380.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2673, pruned_loss=0.06145, over 4716641.73 frames. ], batch size: 93, lr: 8.06e-03, grad_scale: 16.0 2023-09-29 17:15:22,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 17:15:25,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:28,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:33,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:33,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:36,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:15:36,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:40,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:41,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=434033.3333333333, ans=0.125 2023-09-29 17:15:42,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 17:15:43,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:15:43,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:15:46,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 17:15:47,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:15:49,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:15:49,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 17:15:50,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 17:15:51,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=434033.3333333333, ans=0.1 2023-09-29 17:15:53,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 17:15:53,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:55,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 17:16:07,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:18,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 17:16:20,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=434166.6666666667, ans=0.0 2023-09-29 17:16:22,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:23,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 17:16:23,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:16:23,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:16:28,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:16:30,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 17:16:31,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:16:37,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 17:16:40,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 17:16:40,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=434233.3333333333, ans=0.02 2023-09-29 17:16:41,599 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.935e+02 2.126e+02 2.533e+02 4.347e+02, threshold=4.251e+02, percent-clipped=1.0 2023-09-29 17:16:43,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 17:16:44,926 INFO [train.py:1039] (1/4) Epoch 13, batch 1400, loss[loss=0.1736, simple_loss=0.2605, pruned_loss=0.04333, over 24634.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2667, pruned_loss=0.06155, over 4717155.40 frames. ], batch size: 68, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:16:47,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:50,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:16:52,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:16:56,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 17:16:58,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 17:17:10,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:17:11,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:14,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:17:14,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:17:18,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:17:20,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:17:30,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:30,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:34,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 17:17:34,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:17:36,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:17:37,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:17:39,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:39,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:17:39,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:17:41,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:17:43,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 17:17:43,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:17:43,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.86 vs. limit=15.0 2023-09-29 17:17:47,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:48,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=434500.0, ans=10.0 2023-09-29 17:17:48,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.83 vs. limit=22.5 2023-09-29 17:17:51,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:18:01,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 17:18:03,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:18:03,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:18:06,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 17:18:07,640 INFO [train.py:1039] (1/4) Epoch 13, batch 1450, loss[loss=0.1842, simple_loss=0.2702, pruned_loss=0.04905, over 24647.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2662, pruned_loss=0.06147, over 4724862.31 frames. ], batch size: 73, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:18:07,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:12,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:18:13,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:18:17,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:18:17,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:17,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:18:19,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=434633.3333333333, ans=0.1 2023-09-29 17:18:22,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:22,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:18:25,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:18:25,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 17:18:27,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:18:27,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 17:18:29,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:30,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:30,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 17:18:31,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:18:32,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.85 vs. limit=15.0 2023-09-29 17:18:33,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:18:33,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 17:18:33,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:34,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:18:36,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:38,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:38,871 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.73 vs. limit=15.0 2023-09-29 17:18:42,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:18:44,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:18:45,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:45,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:46,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=434766.6666666667, ans=0.125 2023-09-29 17:18:48,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:48,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:18:48,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:50,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:18:53,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 17:18:55,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:19:00,140 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 17:19:02,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:03,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:19:04,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:06,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 17:19:07,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=434833.3333333333, ans=0.0 2023-09-29 17:19:09,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:11,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 17:19:14,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 17:19:15,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:17,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:18,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:20,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 17:19:22,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 17:19:23,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 17:19:25,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:25,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:19:27,163 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.823e+02 1.982e+02 2.343e+02 3.097e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-29 17:19:30,430 INFO [train.py:1039] (1/4) Epoch 13, batch 1500, loss[loss=0.2084, simple_loss=0.2697, pruned_loss=0.0736, over 23666.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2658, pruned_loss=0.06093, over 4728512.58 frames. ], batch size: 164, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:19:39,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 17:19:39,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:19:39,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:19:41,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:42,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:42,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:19:44,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 17:19:45,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:19:45,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:19:45,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:47,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:47,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:19:49,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 17:19:55,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:19:55,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:19:56,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:01,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 17:20:06,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 17:20:08,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:20:08,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 17:20:11,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:20:13,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:14,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:16,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:20:18,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 17:20:18,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:20:18,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:20,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 17:20:20,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:20,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=435166.6666666667, ans=0.125 2023-09-29 17:20:26,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:20:26,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 17:20:32,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:20:32,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=435166.6666666667, ans=0.0 2023-09-29 17:20:34,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:20:39,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 17:20:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:40,621 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 17:20:40,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:20:42,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:20:42,787 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 17:20:44,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:20:47,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 17:20:48,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:51,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=435300.0, ans=0.09899494936611666 2023-09-29 17:20:52,194 INFO [train.py:1039] (1/4) Epoch 13, batch 1550, loss[loss=0.1719, simple_loss=0.2513, pruned_loss=0.04625, over 24605.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.267, pruned_loss=0.06088, over 4731621.63 frames. ], batch size: 60, lr: 8.04e-03, grad_scale: 16.0 2023-09-29 17:20:54,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:54,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:54,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:56,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:56,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:57,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 17:20:57,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 17:20:59,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:21:00,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 17:21:00,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 17:21:02,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:04,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:05,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:05,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:21:07,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:07,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:10,259 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 17:21:10,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:10,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:21:10,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:21:15,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:21:15,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 17:21:16,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:16,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 17:21:19,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 17:21:19,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 17:21:19,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:20,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:25,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:21:27,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 17:21:27,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 17:21:33,580 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.80 vs. limit=22.5 2023-09-29 17:21:35,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:40,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:40,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:21:40,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:21:40,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 17:21:45,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=435500.0, ans=0.0 2023-09-29 17:21:46,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:21:49,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:51,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:21:53,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:21:53,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:55,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 17:21:55,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:21:57,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:21:58,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:58,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:21:58,718 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 17:22:01,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:03,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=435566.6666666667, ans=0.0 2023-09-29 17:22:08,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 17:22:11,514 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 2.000e+02 2.251e+02 2.787e+02 4.721e+02, threshold=4.502e+02, percent-clipped=2.0 2023-09-29 17:22:11,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:13,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:22:13,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.35 vs. limit=15.0 2023-09-29 17:22:14,589 INFO [train.py:1039] (1/4) Epoch 13, batch 1600, loss[loss=0.2011, simple_loss=0.2727, pruned_loss=0.06478, over 23234.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2675, pruned_loss=0.06037, over 4743023.20 frames. ], batch size: 93, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:22:14,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 17:22:16,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:22:16,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:16,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:22:16,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:22:18,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:22:21,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=435633.3333333333, ans=0.2 2023-09-29 17:22:22,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:22,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 17:22:24,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 17:22:26,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 17:22:28,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:22:30,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 17:22:31,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:22:32,560 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.68 vs. limit=15.0 2023-09-29 17:22:34,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:22:35,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=435700.0, ans=0.1 2023-09-29 17:22:39,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:22:44,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 17:22:46,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:22:46,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 17:22:46,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:47,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 17:22:51,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 17:22:53,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=435766.6666666667, ans=0.2 2023-09-29 17:22:58,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:22:58,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=435766.6666666667, ans=0.0 2023-09-29 17:23:00,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 17:23:00,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:23:02,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:02,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:23:06,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 17:23:09,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 17:23:12,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:23:13,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:23:16,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:23:17,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:23:19,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:23:25,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:27,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:23:27,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=435900.0, ans=0.0 2023-09-29 17:23:30,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 17:23:30,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:23:30,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 17:23:35,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:37,175 INFO [train.py:1039] (1/4) Epoch 13, batch 1650, loss[loss=0.1853, simple_loss=0.2617, pruned_loss=0.05445, over 24483.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2688, pruned_loss=0.06139, over 4732201.28 frames. ], batch size: 66, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:23:37,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:23:37,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:23:37,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 17:23:38,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 17:23:38,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 17:23:38,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 17:23:44,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:44,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:45,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:23:45,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:23:48,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:50,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 17:23:53,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:54,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:54,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:23:54,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:23:54,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 17:23:54,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 17:23:56,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=436033.3333333333, ans=0.0 2023-09-29 17:23:59,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:24:01,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:24:11,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 17:24:13,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:14,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 17:24:16,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=436100.0, ans=0.125 2023-09-29 17:24:18,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:20,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:24:20,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:24:21,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:23,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:24:23,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:26,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:24:27,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:29,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:29,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:29,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=436166.6666666667, ans=0.0 2023-09-29 17:24:30,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:30,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:24:36,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:36,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 17:24:39,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:39,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 17:24:41,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 17:24:41,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 17:24:43,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:43,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:24:43,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:44,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:44,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 17:24:50,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:51,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:24:51,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:55,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 17:24:55,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=436233.3333333333, ans=0.125 2023-09-29 17:24:56,552 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.021e+02 2.239e+02 2.775e+02 4.189e+02, threshold=4.478e+02, percent-clipped=0.0 2023-09-29 17:24:59,918 INFO [train.py:1039] (1/4) Epoch 13, batch 1700, loss[loss=0.2069, simple_loss=0.2662, pruned_loss=0.07376, over 23779.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2675, pruned_loss=0.06172, over 4721728.41 frames. ], batch size: 212, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:25:00,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:25:00,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:25:00,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 17:25:00,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:00,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:25:00,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:03,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:25:04,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:25:04,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 17:25:07,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:25:16,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:18,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:25:24,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:25:24,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:26,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:26,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:29,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 17:25:33,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:25:33,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:33,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:25:34,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:25:36,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 17:25:37,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 17:25:38,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=436433.3333333333, ans=0.125 2023-09-29 17:25:39,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:40,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 17:25:44,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:25:53,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:25:53,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:25:54,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:56,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:25:56,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 17:25:57,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:59,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:59,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 17:26:01,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:01,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:01,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:01,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:04,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:04,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:26:04,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:04,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:26:05,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:06,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=436566.6666666667, ans=0.1 2023-09-29 17:26:09,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:11,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 17:26:14,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:16,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:17,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 17:26:22,893 INFO [train.py:1039] (1/4) Epoch 13, batch 1750, loss[loss=0.1915, simple_loss=0.2784, pruned_loss=0.05234, over 24645.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2658, pruned_loss=0.06088, over 4715399.88 frames. ], batch size: 73, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:26:24,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:26,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:28,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:26:28,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 17:26:28,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:32,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:26:32,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:39,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 17:26:40,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:43,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 17:26:44,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:45,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:26:48,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:26:50,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 17:26:51,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.80 vs. limit=15.0 2023-09-29 17:26:52,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:52,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 17:26:52,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=436700.0, ans=0.2 2023-09-29 17:27:00,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:27:04,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:04,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:07,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:07,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:11,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:27:12,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:13,389 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.91 vs. limit=12.0 2023-09-29 17:27:14,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:14,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:27:15,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 17:27:17,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:19,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=436833.3333333333, ans=0.2 2023-09-29 17:27:19,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=436833.3333333333, ans=0.125 2023-09-29 17:27:20,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 17:27:22,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:22,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:22,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:27:22,836 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-09-29 17:27:26,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=436833.3333333333, ans=0.125 2023-09-29 17:27:28,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:27:28,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:27:29,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:31,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=436900.0, ans=0.125 2023-09-29 17:27:32,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:37,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:40,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:27:42,027 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.899e+02 2.041e+02 2.421e+02 3.023e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-29 17:27:43,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:27:44,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=436966.6666666667, ans=0.0 2023-09-29 17:27:45,434 INFO [train.py:1039] (1/4) Epoch 13, batch 1800, loss[loss=0.1972, simple_loss=0.2634, pruned_loss=0.06548, over 23715.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2655, pruned_loss=0.06073, over 4714604.88 frames. ], batch size: 232, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:27:45,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 17:27:45,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:46,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.18 vs. limit=15.0 2023-09-29 17:27:48,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:27:48,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:27:48,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:27:48,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:27:48,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:27:51,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:27:52,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:55,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:27:56,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:59,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:28:01,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:28:04,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:05,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=437033.3333333333, ans=0.125 2023-09-29 17:28:08,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:28:12,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:28:12,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 17:28:13,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:18,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:22,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 17:28:23,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 17:28:23,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 17:28:25,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:26,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:26,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:28:28,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:28:32,923 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 17:28:34,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:28:38,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:38,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 17:28:39,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 17:28:41,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:28:41,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=437166.6666666667, ans=0.125 2023-09-29 17:28:43,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:28:44,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=437166.6666666667, ans=0.1 2023-09-29 17:28:45,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:28:48,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 17:28:50,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=437233.3333333333, ans=0.125 2023-09-29 17:28:52,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.whiten.whitening_limit, batch_count=437233.3333333333, ans=12.0 2023-09-29 17:28:54,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:28:56,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 17:28:56,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:28:56,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:56,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:28:58,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 17:29:01,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:29:01,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:03,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 17:29:03,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:04,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:04,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:29:04,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,703 INFO [train.py:1039] (1/4) Epoch 13, batch 1850, loss[loss=0.1523, simple_loss=0.2257, pruned_loss=0.0394, over 24322.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2656, pruned_loss=0.05991, over 4728700.86 frames. ], batch size: 56, lr: 8.03e-03, grad_scale: 16.0 2023-09-29 17:29:07,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:29:08,044 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:29:09,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:29:10,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:14,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:29:14,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:29:22,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:29:22,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 17:29:24,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=437366.6666666667, ans=0.0 2023-09-29 17:29:26,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 17:29:29,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 17:29:32,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:34,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 17:29:34,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 17:29:40,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=437433.3333333333, ans=0.0 2023-09-29 17:29:43,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:29:47,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 17:29:47,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:29:49,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:29:52,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 17:29:52,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:54,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:29:54,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:29:58,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:30:01,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:05,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:30:05,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:05,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:30:05,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:08,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:10,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:30:13,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 17:30:13,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:13,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:16,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:30:17,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:30:17,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 17:30:17,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 17:30:19,426 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 17:30:21,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 17:30:24,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:30:24,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:30:24,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:24,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:25,101 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 17:30:25,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:30:25,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:27,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:28,476 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.931e+02 2.222e+02 2.775e+02 3.962e+02, threshold=4.445e+02, percent-clipped=0.0 2023-09-29 17:30:28,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:30:28,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:30:30,327 INFO [train.py:1039] (1/4) Epoch 13, batch 1900, loss[loss=0.194, simple_loss=0.2765, pruned_loss=0.05581, over 24456.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2657, pruned_loss=0.06039, over 4719955.05 frames. ], batch size: 69, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:30:30,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:30:30,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 17:30:33,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:33,573 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 17:30:33,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:30:35,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:40,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:43,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:30:43,728 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 17:30:45,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 17:30:45,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:46,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:46,808 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 17:30:47,089 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:30:48,202 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 17:30:48,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=437700.0, ans=0.1 2023-09-29 17:30:50,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=437700.0, ans=0.125 2023-09-29 17:30:51,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 17:30:51,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=437700.0, ans=0.125 2023-09-29 17:30:52,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:30:56,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 17:31:00,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 17:31:06,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=437766.6666666667, ans=0.125 2023-09-29 17:31:11,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 17:31:13,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 17:31:14,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:31:14,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 17:31:14,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 17:31:14,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 17:31:16,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 17:31:16,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:31:17,244 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-09-29 17:31:21,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 17:31:25,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:31:27,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:27,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 17:31:27,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=437833.3333333333, ans=0.125 2023-09-29 17:31:28,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:31:31,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.out_whiten.whitening_limit, batch_count=437833.3333333333, ans=8.0 2023-09-29 17:31:34,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 17:31:36,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:41,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:31:41,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:31:41,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:31:42,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:31:46,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:31:46,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:31:46,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:31:49,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:49,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:31:49,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.38 vs. limit=10.0 2023-09-29 17:31:52,162 INFO [train.py:1039] (1/4) Epoch 13, batch 1950, loss[loss=0.1765, simple_loss=0.259, pruned_loss=0.04698, over 24680.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2672, pruned_loss=0.06117, over 4707469.23 frames. ], batch size: 73, lr: 8.02e-03, grad_scale: 8.0 2023-09-29 17:31:52,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:31:52,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:52,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:52,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=437966.6666666667, ans=0.1 2023-09-29 17:31:53,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:55,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:00,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:32:00,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:00,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:32:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 17:32:04,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:32:04,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:05,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=437966.6666666667, ans=0.1 2023-09-29 17:32:07,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:08,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:32:10,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:10,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:12,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:16,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:16,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:32:17,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:32:17,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:20,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:24,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:32:24,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:24,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:32:24,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 17:32:24,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:32:24,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:32:25,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:27,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=438100.0, ans=0.1 2023-09-29 17:32:29,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:30,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:32:35,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:32:38,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:32:38,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:32:40,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 17:32:40,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:32:45,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:47,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:32:48,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:32:55,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=12.06 vs. limit=15.0 2023-09-29 17:32:56,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:56,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=438166.6666666667, ans=0.125 2023-09-29 17:32:57,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:59,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:00,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:03,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.81 vs. limit=12.0 2023-09-29 17:33:03,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:33:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:05,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 17:33:05,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:33:06,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:33:06,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=438233.3333333333, ans=0.125 2023-09-29 17:33:08,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 17:33:11,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:14,894 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.174e+02 2.503e+02 4.017e+02, threshold=4.347e+02, percent-clipped=0.0 2023-09-29 17:33:14,935 INFO [train.py:1039] (1/4) Epoch 13, batch 2000, loss[loss=0.1844, simple_loss=0.2625, pruned_loss=0.05313, over 24474.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2677, pruned_loss=0.06186, over 4705174.56 frames. ], batch size: 66, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:33:15,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:33:17,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:33:17,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:33:18,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:33:21,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:24,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 17:33:25,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:33:27,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=438300.0, ans=0.2 2023-09-29 17:33:28,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:33:30,275 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:33:31,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 17:33:33,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:33:33,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:36,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:33:36,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=438366.6666666667, ans=0.0 2023-09-29 17:33:37,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 17:33:40,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:44,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 17:33:45,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:33:46,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 17:33:46,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:33:50,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:33:52,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:33:52,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:54,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:33:54,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:33:56,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 17:33:59,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 17:33:59,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:34:00,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:04,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=438500.0, ans=0.1 2023-09-29 17:34:06,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:07,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:34:07,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:07,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:34:09,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:10,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:10,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:10,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:12,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:15,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:34:15,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 17:34:21,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:34:23,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:34:33,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:35,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:35,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:36,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:34:36,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:34:37,992 INFO [train.py:1039] (1/4) Epoch 13, batch 2050, loss[loss=0.1892, simple_loss=0.2637, pruned_loss=0.05733, over 23330.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2664, pruned_loss=0.06134, over 4702459.48 frames. ], batch size: 93, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:34:38,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:39,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:42,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:43,617 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-09-29 17:34:44,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:49,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:52,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:34:52,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:53,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.71 vs. limit=22.5 2023-09-29 17:34:53,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:34:54,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 17:34:54,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:34:54,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=438700.0, ans=0.125 2023-09-29 17:34:55,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:55,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:34:55,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=438700.0, ans=0.125 2023-09-29 17:34:59,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=438700.0, ans=0.125 2023-09-29 17:35:05,826 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.65 vs. limit=15.0 2023-09-29 17:35:09,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:09,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:11,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 17:35:13,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:14,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 17:35:14,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:19,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:20,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:22,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:35:22,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:24,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:35:25,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:35:25,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:35:30,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:30,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=438833.3333333333, ans=0.0 2023-09-29 17:35:32,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:35:35,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:35:35,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:35:39,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:35:44,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:35:45,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 17:35:50,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:35:52,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:35:53,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:35:55,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 17:35:59,782 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.857e+02 2.010e+02 2.339e+02 3.458e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-29 17:35:59,846 INFO [train.py:1039] (1/4) Epoch 13, batch 2100, loss[loss=0.202, simple_loss=0.2596, pruned_loss=0.07216, over 23751.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2648, pruned_loss=0.06049, over 4708833.08 frames. ], batch size: 164, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:36:00,015 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 17:36:00,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:01,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:01,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:03,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:36:03,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 17:36:03,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 17:36:05,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:36:08,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:36:09,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:36:11,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:11,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:36:11,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 17:36:13,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:36:13,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 17:36:13,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 17:36:15,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:17,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:36:17,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 17:36:17,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 17:36:17,570 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=439033.3333333333, ans=0.2 2023-09-29 17:36:23,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 17:36:23,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:27,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:36:27,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:30,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:36:31,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 17:36:32,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:32,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:36:34,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 17:36:35,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:35,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 17:36:37,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 17:36:37,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 17:36:39,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:36:42,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:36:44,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:44,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:47,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:49,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:49,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 17:36:49,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:51,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:52,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:52,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 17:36:54,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 17:36:54,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 17:36:58,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:37:01,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:37:03,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 17:37:07,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:09,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:37:09,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:09,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:09,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:37:09,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:37:13,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:13,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:37:14,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:37:14,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:16,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 17:37:18,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 17:37:18,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:21,811 INFO [train.py:1039] (1/4) Epoch 13, batch 2150, loss[loss=0.1849, simple_loss=0.2511, pruned_loss=0.05929, over 23560.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2641, pruned_loss=0.06071, over 4712014.56 frames. ], batch size: 149, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:37:22,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:37:22,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:37:23,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:37:23,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:37:23,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=439300.0, ans=0.1 2023-09-29 17:37:30,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:37:31,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:33,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:34,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:37:34,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:36,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:37:39,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:40,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:37:40,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:37:42,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 17:37:48,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:37:50,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:37:52,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:52,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:52,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:53,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:37:53,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:53,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:55,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:56,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 17:37:58,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:37:59,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:59,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:01,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:38:01,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:38:05,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:06,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:38:06,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:06,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 17:38:06,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:38:11,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:12,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:13,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:14,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:38:14,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:16,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:16,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 17:38:18,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 17:38:19,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:38:19,994 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 17:38:21,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:21,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:38:23,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 17:38:23,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:38:23,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 17:38:23,617 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 17:38:23,617 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 17:38:23,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 17:38:25,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:26,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:38:26,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:38:28,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:29,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:38:31,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:31,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:39,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:38:41,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 17:38:44,268 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.873e+02 2.053e+02 2.392e+02 4.399e+02, threshold=4.106e+02, percent-clipped=1.0 2023-09-29 17:38:44,310 INFO [train.py:1039] (1/4) Epoch 13, batch 2200, loss[loss=0.1843, simple_loss=0.2673, pruned_loss=0.05066, over 24350.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2639, pruned_loss=0.06098, over 4697495.87 frames. ], batch size: 74, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:38:44,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:38:44,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=439633.3333333333, ans=0.0 2023-09-29 17:38:49,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:50,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:38:52,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:52,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:38:57,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:57,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:57,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 17:39:03,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 17:39:03,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:39:05,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=439700.0, ans=0.125 2023-09-29 17:39:05,788 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.68 vs. limit=22.5 2023-09-29 17:39:08,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 17:39:11,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:12,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:13,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:39:19,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:39:19,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 17:39:23,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:39:26,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:28,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 17:39:31,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:39:33,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:39:37,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:38,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 17:39:40,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:41,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 17:39:43,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=439833.3333333333, ans=0.125 2023-09-29 17:39:44,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:44,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:39:44,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:48,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:48,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:48,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:39:48,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=439900.0, ans=0.0 2023-09-29 17:39:49,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:39:51,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:39:54,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:39:54,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:39:57,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=439900.0, ans=0.125 2023-09-29 17:39:58,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:39:58,405 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 17:40:01,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:40:01,955 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 17:40:02,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:40:03,482 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 17:40:05,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:05,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:40:07,067 INFO [train.py:1039] (1/4) Epoch 13, batch 2250, loss[loss=0.2726, simple_loss=0.322, pruned_loss=0.1116, over 19740.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2653, pruned_loss=0.06135, over 4700844.76 frames. ], batch size: 388, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:40:08,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:08,790 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 17:40:11,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:40:13,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:16,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=439966.6666666667, ans=0.125 2023-09-29 17:40:16,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=439966.6666666667, ans=0.125 2023-09-29 17:40:19,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:40:21,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:40:25,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:27,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:27,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:28,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 17:40:30,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:40:30,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:40:33,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 17:40:34,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:40:34,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:37,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:40,108 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:40:43,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:40:45,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:40:45,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:40:46,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 17:40:48,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:49,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:40:55,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:57,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:58,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:58,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:41:00,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:41:01,581 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.36 vs. limit=15.0 2023-09-29 17:41:02,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:41:06,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:41:07,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:41:15,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:41:17,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:41:18,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:41:23,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:41:26,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:41:26,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 17:41:26,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:26,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:41:26,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=440233.3333333333, ans=0.125 2023-09-29 17:41:27,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 17:41:29,322 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.929e+02 2.161e+02 2.428e+02 3.244e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-29 17:41:29,379 INFO [train.py:1039] (1/4) Epoch 13, batch 2300, loss[loss=0.2298, simple_loss=0.2907, pruned_loss=0.08451, over 22787.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2665, pruned_loss=0.06164, over 4710610.42 frames. ], batch size: 322, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:41:31,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=440300.0, ans=0.125 2023-09-29 17:41:32,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:41:33,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:35,987 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:41:38,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:38,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:41:42,420 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 17:41:44,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:44,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=440366.6666666667, ans=0.0 2023-09-29 17:41:52,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:41:52,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:41:54,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:41:54,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:54,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 17:41:54,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=440366.6666666667, ans=0.0 2023-09-29 17:41:55,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:41:56,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=440366.6666666667, ans=0.125 2023-09-29 17:41:58,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:41:59,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:42:02,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:42:06,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:42:08,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:13,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:42:13,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:42:16,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:42:19,071 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.05 vs. limit=15.0 2023-09-29 17:42:19,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:42:23,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:42:23,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:42:25,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:42:25,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 17:42:30,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:42:30,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:30,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=440500.0, ans=0.0 2023-09-29 17:42:31,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:32,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:42:33,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:34,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 17:42:34,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:42:35,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 17:42:35,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:42:35,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:35,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 17:42:41,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:42:44,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:42:49,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:49,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:42:51,367 INFO [train.py:1039] (1/4) Epoch 13, batch 2350, loss[loss=0.1825, simple_loss=0.2584, pruned_loss=0.05328, over 24649.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2669, pruned_loss=0.06176, over 4718784.52 frames. ], batch size: 65, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:42:51,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:42:51,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:42:51,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:42:53,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:42:53,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 17:43:01,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:01,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 17:43:07,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 17:43:10,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:43:13,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:14,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:15,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 17:43:18,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:43:25,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 17:43:26,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:30,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:43:30,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:43:34,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:43:35,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 17:43:35,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:43:39,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:39,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:43:39,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:43:42,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:43:43,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 17:43:45,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:46,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:46,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:43:48,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=440833.3333333333, ans=0.125 2023-09-29 17:43:49,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 17:43:49,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:43:54,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 17:43:54,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:43:59,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 17:44:04,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 17:44:04,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:44:04,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:44:06,475 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 17:44:06,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 17:44:07,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 17:44:10,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:44:14,662 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.807e+02 2.059e+02 2.357e+02 3.650e+02, threshold=4.118e+02, percent-clipped=0.0 2023-09-29 17:44:14,724 INFO [train.py:1039] (1/4) Epoch 13, batch 2400, loss[loss=0.2127, simple_loss=0.255, pruned_loss=0.08518, over 19687.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2662, pruned_loss=0.06189, over 4705164.50 frames. ], batch size: 388, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:44:14,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:44:15,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=440966.6666666667, ans=0.125 2023-09-29 17:44:18,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:44:20,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:44:21,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 17:44:21,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 17:44:21,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=440966.6666666667, ans=0.125 2023-09-29 17:44:30,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:44:30,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:44:32,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 17:44:32,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:44:32,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:33,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 17:44:40,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:42,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 17:44:46,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=441100.0, ans=0.125 2023-09-29 17:44:47,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:44:49,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=441100.0, ans=0.1 2023-09-29 17:44:50,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 17:44:53,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:44:53,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=441100.0, ans=0.125 2023-09-29 17:44:55,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:58,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=441100.0, ans=0.1 2023-09-29 17:44:59,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:44:59,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 17:44:59,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:45:08,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:12,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:14,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:16,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:45:17,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:45:17,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:45:17,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:17,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:19,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:45:20,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=441233.3333333333, ans=0.125 2023-09-29 17:45:23,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:45:24,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:45:24,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 17:45:25,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 17:45:26,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:45:26,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:26,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 17:45:28,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 17:45:29,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 17:45:29,842 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 17:45:31,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 17:45:32,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:45:34,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:34,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:35,884 INFO [train.py:1039] (1/4) Epoch 13, batch 2450, loss[loss=0.1802, simple_loss=0.2539, pruned_loss=0.05329, over 24454.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.265, pruned_loss=0.06145, over 4704696.07 frames. ], batch size: 58, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:45:36,014 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 17:45:36,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:37,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:45:39,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=441300.0, ans=0.0 2023-09-29 17:45:42,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:45:42,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:42,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=441300.0, ans=0.04949747468305833 2023-09-29 17:45:48,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:48,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:48,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=441300.0, ans=0.125 2023-09-29 17:45:50,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 17:45:56,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:56,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:59,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:45:59,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:45:59,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:45:59,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 17:46:04,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:05,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:46:07,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:46:09,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.65 vs. limit=22.5 2023-09-29 17:46:10,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:46:10,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:11,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:11,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:46:13,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 17:46:15,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:46:20,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=441433.3333333333, ans=0.125 2023-09-29 17:46:23,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:24,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:25,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:25,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:46:25,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:25,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=441500.0, ans=0.125 2023-09-29 17:46:26,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:46:28,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 17:46:28,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=441500.0, ans=0.125 2023-09-29 17:46:31,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:31,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:46:33,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-09-29 17:46:35,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:46:35,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:40,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:46:40,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 17:46:42,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:46:43,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:46:45,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 17:46:45,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:46:46,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:46:49,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:46:53,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:53,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:46:58,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 17:46:59,950 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.935e+02 2.161e+02 2.595e+02 3.888e+02, threshold=4.322e+02, percent-clipped=0.0 2023-09-29 17:46:59,992 INFO [train.py:1039] (1/4) Epoch 13, batch 2500, loss[loss=0.1761, simple_loss=0.257, pruned_loss=0.04762, over 24665.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2645, pruned_loss=0.06138, over 4690946.67 frames. ], batch size: 65, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:47:00,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:47:00,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=441633.3333333333, ans=0.0 2023-09-29 17:47:06,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:09,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=441633.3333333333, ans=0.02 2023-09-29 17:47:12,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=441633.3333333333, ans=0.1 2023-09-29 17:47:15,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:47:15,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:47:17,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:17,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 17:47:25,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:47:25,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:47:27,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:47:29,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:47:29,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 17:47:29,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:29,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:32,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 17:47:32,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:32,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 17:47:33,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:38,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:47:39,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:41,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:47:41,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 17:47:41,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:47:44,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:47,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:51,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:54,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:48:00,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:48:05,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 17:48:05,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:05,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:06,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:48:06,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:48:08,442 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 17:48:08,443 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 17:48:08,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 17:48:08,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=441900.0, ans=0.125 2023-09-29 17:48:12,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:15,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 17:48:17,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 17:48:17,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:48:18,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 17:48:21,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.28 vs. limit=15.0 2023-09-29 17:48:22,058 INFO [train.py:1039] (1/4) Epoch 13, batch 2550, loss[loss=0.1913, simple_loss=0.2745, pruned_loss=0.05403, over 23977.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2653, pruned_loss=0.06122, over 4686532.11 frames. ], batch size: 80, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:48:22,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 17:48:25,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:25,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=441966.6666666667, ans=0.125 2023-09-29 17:48:25,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=441966.6666666667, ans=0.125 2023-09-29 17:48:26,000 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.14 vs. limit=15.0 2023-09-29 17:48:26,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:48:26,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:48:28,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:30,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 17:48:30,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:48:31,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=441966.6666666667, ans=0.0 2023-09-29 17:48:36,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 17:48:39,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:48:43,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:43,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:43,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 17:48:44,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:48:44,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:48:44,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:47,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:48:48,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 17:48:48,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:48,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:48,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 17:49:00,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:49:04,809 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.24 vs. limit=10.0 2023-09-29 17:49:05,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:05,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:05,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:49:07,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:49:12,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:49:16,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:49:17,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:49:17,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:49:17,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:49:19,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:49:22,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:23,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:26,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:49:26,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 17:49:26,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:49:28,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:29,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:49:31,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:49:32,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:37,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:49:41,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:44,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.55 vs. limit=12.0 2023-09-29 17:49:44,604 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.935e+02 2.262e+02 2.614e+02 3.523e+02, threshold=4.524e+02, percent-clipped=0.0 2023-09-29 17:49:44,646 INFO [train.py:1039] (1/4) Epoch 13, batch 2600, loss[loss=0.1769, simple_loss=0.2538, pruned_loss=0.05, over 24483.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2657, pruned_loss=0.06147, over 4694286.91 frames. ], batch size: 63, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:49:44,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=442300.0, ans=0.2 2023-09-29 17:49:45,026 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:49:46,253 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 17:49:49,385 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 17:49:50,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:49:50,829 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 17:49:50,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 17:49:50,977 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 17:49:54,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:56,111 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 17:49:56,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 17:49:57,766 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 17:49:59,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:50:02,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 17:50:02,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 17:50:05,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:50:05,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 17:50:07,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=442366.6666666667, ans=0.125 2023-09-29 17:50:08,535 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 17:50:08,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 17:50:14,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=442366.6666666667, ans=0.0 2023-09-29 17:50:17,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:18,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:18,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=442433.3333333333, ans=0.125 2023-09-29 17:50:19,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:19,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 17:50:21,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:50:27,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 17:50:32,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:33,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:35,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 17:50:36,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:36,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:36,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 17:50:38,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:50:38,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:50:41,534 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.51 vs. limit=5.0 2023-09-29 17:50:41,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,886 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 17:50:45,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:50:50,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:53,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:50:53,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 17:50:54,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:56,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:50:57,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:50:58,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=442566.6666666667, ans=0.0 2023-09-29 17:51:01,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=442566.6666666667, ans=0.1 2023-09-29 17:51:03,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 17:51:04,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:06,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=442633.3333333333, ans=0.0 2023-09-29 17:51:07,434 INFO [train.py:1039] (1/4) Epoch 13, batch 2650, loss[loss=0.1955, simple_loss=0.282, pruned_loss=0.05451, over 24652.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2666, pruned_loss=0.06178, over 4686489.05 frames. ], batch size: 68, lr: 7.98e-03, grad_scale: 16.0 2023-09-29 17:51:07,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:51:09,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=442633.3333333333, ans=0.2 2023-09-29 17:51:10,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 17:51:10,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:12,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:51:12,562 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 17:51:12,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:15,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:18,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:51:20,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:24,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:51:25,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 17:51:25,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:51:25,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:51:28,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 17:51:29,829 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 17:51:32,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:51:35,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 17:51:35,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:51:37,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 17:51:42,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:51:42,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:47,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 17:51:47,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 17:51:50,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:51:55,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 17:51:55,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:55,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:57,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:51:57,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:59,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:00,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:52:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:04,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:52:06,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:52:07,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:52:08,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:09,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:52:11,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:11,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:12,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:52:13,300 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:52:16,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:16,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:52:16,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:18,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 17:52:19,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:22,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:23,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=442900.0, ans=0.125 2023-09-29 17:52:24,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:26,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:52:26,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:29,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:52:29,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 17:52:31,196 INFO [train.py:1039] (1/4) Epoch 13, batch 2700, loss[loss=0.1703, simple_loss=0.2477, pruned_loss=0.04645, over 24315.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2672, pruned_loss=0.06178, over 4694443.06 frames. ], batch size: 61, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:52:32,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.954e+02 2.253e+02 2.566e+02 4.959e+02, threshold=4.505e+02, percent-clipped=1.0 2023-09-29 17:52:32,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:52:36,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 17:52:38,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:52:38,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:38,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.21 vs. limit=10.0 2023-09-29 17:52:39,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:52:39,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:39,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:52:40,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:52:40,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 17:52:41,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:52:41,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:52:41,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=442966.6666666667, ans=0.1 2023-09-29 17:52:43,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:52:44,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:48,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:52:50,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 17:52:50,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:52:56,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:52:56,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:52:56,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=443033.3333333333, ans=0.0 2023-09-29 17:53:01,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:53:01,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:53:01,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:53:01,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:53:03,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:04,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=443100.0, ans=0.125 2023-09-29 17:53:08,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:08,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:53:09,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.83 vs. limit=15.0 2023-09-29 17:53:09,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:53:12,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:12,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:53:23,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:53:23,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:53:27,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:53:27,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:31,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:33,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:33,578 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:53:34,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:36,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:37,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:53:40,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:53:42,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:42,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:46,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 17:53:46,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:48,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:53:48,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 17:53:49,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 17:53:51,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:53,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:53:53,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:53,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=443300.0, ans=0.0 2023-09-29 17:53:54,523 INFO [train.py:1039] (1/4) Epoch 13, batch 2750, loss[loss=0.2012, simple_loss=0.2781, pruned_loss=0.06212, over 23266.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2672, pruned_loss=0.06214, over 4681608.77 frames. ], batch size: 105, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:53:57,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:57,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:53:57,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:01,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:54:01,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:54:01,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 17:54:01,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:54:02,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:54:07,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=443300.0, ans=0.125 2023-09-29 17:54:08,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 17:54:11,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:54:12,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:13,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:14,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:54:15,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:54:16,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:54:17,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:18,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:18,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=443366.6666666667, ans=0.1 2023-09-29 17:54:21,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:54:21,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:54:23,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:54:23,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:26,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:54:35,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:37,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:54:37,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:42,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:42,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:54:43,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:54:43,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=443500.0, ans=0.0 2023-09-29 17:54:51,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:54:51,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:51,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 17:54:52,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=443500.0, ans=0.0 2023-09-29 17:54:53,087 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=443500.0, ans=0.0 2023-09-29 17:54:57,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:58,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 17:55:03,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=443566.6666666667, ans=0.2 2023-09-29 17:55:04,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:55:06,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:55:06,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 17:55:07,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:09,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:55:09,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 17:55:10,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:55:14,474 INFO [train.py:1039] (1/4) Epoch 13, batch 2800, loss[loss=0.1947, simple_loss=0.2792, pruned_loss=0.05509, over 24561.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2667, pruned_loss=0.06145, over 4691234.33 frames. ], batch size: 71, lr: 7.97e-03, grad_scale: 32.0 2023-09-29 17:55:14,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 17:55:15,774 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.948e+02 2.222e+02 2.625e+02 4.530e+02, threshold=4.443e+02, percent-clipped=1.0 2023-09-29 17:55:15,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:15,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:55:17,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 17:55:17,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:17,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:21,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:21,119 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 17:55:21,120 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 17:55:23,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:23,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=443633.3333333333, ans=0.125 2023-09-29 17:55:27,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:55:27,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:55:30,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:55:33,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 17:55:33,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=443700.0, ans=0.07 2023-09-29 17:55:35,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 17:55:35,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=443700.0, ans=0.125 2023-09-29 17:55:36,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 17:55:36,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:38,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:55:38,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:55:43,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:55:43,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:43,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:55:44,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:55:46,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=443766.6666666667, ans=0.2 2023-09-29 17:55:52,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:55:56,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:58,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:59,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:59,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:05,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:05,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=443833.3333333333, ans=0.0 2023-09-29 17:56:06,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 17:56:06,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:08,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:08,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:56:11,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:12,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:17,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:17,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=443833.3333333333, ans=0.125 2023-09-29 17:56:18,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:56:18,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:18,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:56:20,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:56:21,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:56:22,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:56:22,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 17:56:22,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:24,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:56:24,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:25,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 17:56:27,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:27,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:56:28,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.09 vs. limit=15.0 2023-09-29 17:56:29,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:56:30,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 17:56:33,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=443900.0, ans=0.0 2023-09-29 17:56:33,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=443900.0, ans=0.125 2023-09-29 17:56:37,560 INFO [train.py:1039] (1/4) Epoch 13, batch 2850, loss[loss=0.2024, simple_loss=0.2834, pruned_loss=0.0607, over 24629.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2648, pruned_loss=0.06074, over 4690010.80 frames. ], batch size: 68, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:56:37,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:37,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:56:39,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:56:41,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:56:44,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:56:45,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:56:46,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:50,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:50,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:52,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:56:52,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 17:56:57,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=444033.3333333333, ans=0.1 2023-09-29 17:56:59,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=444033.3333333333, ans=0.025 2023-09-29 17:57:00,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 17:57:00,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:02,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 17:57:02,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:04,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 17:57:06,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 17:57:07,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:20,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:22,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:22,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:57:23,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:57:23,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:57:23,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:57:25,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:57:25,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 17:57:27,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:57:27,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:57:28,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:30,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:33,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:33,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:37,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:38,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:40,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:57:40,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:41,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=444166.6666666667, ans=0.125 2023-09-29 17:57:42,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:42,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=444233.3333333333, ans=0.0 2023-09-29 17:57:45,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:57:45,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=444233.3333333333, ans=0.035 2023-09-29 17:57:48,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:57:50,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 17:57:50,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 17:57:52,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:57:53,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:53,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 17:57:55,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:57:55,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:55,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:57:55,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:57:55,186 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 17:57:56,673 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 17:57:56,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:57:56,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:59,691 INFO [train.py:1039] (1/4) Epoch 13, batch 2900, loss[loss=0.2116, simple_loss=0.2898, pruned_loss=0.06665, over 24021.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2646, pruned_loss=0.06071, over 4696787.39 frames. ], batch size: 80, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:58:01,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:02,697 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.932e+02 2.253e+02 2.547e+02 3.848e+02, threshold=4.506e+02, percent-clipped=0.0 2023-09-29 17:58:02,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:02,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:04,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 17:58:07,136 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.40 vs. limit=10.0 2023-09-29 17:58:09,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:10,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 17:58:11,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 17:58:13,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:58:13,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:58:16,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:16,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:58:20,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:58:21,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:24,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:58:24,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 17:58:24,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:58:26,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:29,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 17:58:30,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 17:58:31,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=444433.3333333333, ans=0.1 2023-09-29 17:58:33,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.71 vs. limit=22.5 2023-09-29 17:58:33,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:58:33,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 17:58:33,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:58:34,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=444433.3333333333, ans=0.125 2023-09-29 17:58:34,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=444433.3333333333, ans=0.125 2023-09-29 17:58:37,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:58:37,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:42,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:42,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:43,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.30 vs. limit=15.0 2023-09-29 17:58:45,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:49,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:53,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 17:58:53,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 17:58:53,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:58:56,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:58:58,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=444500.0, ans=0.125 2023-09-29 17:58:59,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 17:58:59,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:59:05,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:59:11,406 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.05 vs. limit=22.5 2023-09-29 17:59:13,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:59:13,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:59:15,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 17:59:18,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:18,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 17:59:20,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:22,143 INFO [train.py:1039] (1/4) Epoch 13, batch 2950, loss[loss=0.1777, simple_loss=0.2593, pruned_loss=0.048, over 24633.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2653, pruned_loss=0.06073, over 4708730.26 frames. ], batch size: 65, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:59:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:59:29,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:30,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 17:59:31,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:31,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:32,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=444633.3333333333, ans=0.125 2023-09-29 17:59:32,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=444633.3333333333, ans=0.125 2023-09-29 17:59:34,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:59:34,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:59:35,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 17:59:37,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 17:59:37,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:59:37,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:39,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=444700.0, ans=0.125 2023-09-29 17:59:42,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:59:43,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:59:45,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:59:45,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=444700.0, ans=0.1 2023-09-29 17:59:47,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:59:49,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:59:49,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:59:50,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:59:52,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.45 vs. limit=15.0 2023-09-29 17:59:54,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 18:00:01,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 18:00:01,582 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 18:00:02,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:00:05,764 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 18:00:05,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 18:00:07,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:00:07,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:00:07,564 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 18:00:07,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:00:07,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=444766.6666666667, ans=0.1 2023-09-29 18:00:07,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=444766.6666666667, ans=0.125 2023-09-29 18:00:10,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 18:00:12,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:00:12,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:00:16,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:18,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:00:18,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:19,624 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 18:00:19,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:19,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 18:00:26,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:28,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:00:28,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 18:00:29,009 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.69 vs. limit=22.5 2023-09-29 18:00:29,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:00:32,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 18:00:35,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:37,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:00:38,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:00:38,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:39,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:00:40,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:00:40,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:00:40,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:00:42,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:00:42,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:43,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:00:45,226 INFO [train.py:1039] (1/4) Epoch 13, batch 3000, loss[loss=0.2022, simple_loss=0.2792, pruned_loss=0.06262, over 24090.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2667, pruned_loss=0.06115, over 4713750.11 frames. ], batch size: 80, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 18:00:45,226 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 18:01:00,604 INFO [train.py:1071] (1/4) Epoch 13, validation: loss=0.3476, simple_loss=0.2869, pruned_loss=0.2041, over 1125622.00 frames. 2023-09-29 18:01:00,605 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 18:01:00,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:00,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 18:01:02,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:04,369 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.886e+02 2.154e+02 2.482e+02 3.380e+02, threshold=4.309e+02, percent-clipped=0.0 2023-09-29 18:01:04,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:06,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:01:09,642 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 18:01:09,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 18:01:11,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:01:11,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:01:12,108 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.24 vs. limit=6.0 2023-09-29 18:01:12,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 18:01:14,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:21,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:01:30,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:01:36,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=445100.0, ans=0.125 2023-09-29 18:01:36,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=445100.0, ans=0.125 2023-09-29 18:01:40,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 18:01:41,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:01:44,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:01:45,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:45,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:01:47,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:01:47,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 18:01:49,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 18:01:50,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:01:50,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:01:52,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=445166.6666666667, ans=0.0 2023-09-29 18:01:53,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:01:53,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:01:55,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:55,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:01:59,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:02:01,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:02:01,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:02:02,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:02:04,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 18:02:05,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:02:05,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:06,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=445233.3333333333, ans=0.1 2023-09-29 18:02:07,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:02:09,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:09,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:11,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:02:11,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 18:02:12,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:02:12,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 18:02:13,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:02:16,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 18:02:19,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:02:20,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:02:21,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 18:02:22,401 INFO [train.py:1039] (1/4) Epoch 13, batch 3050, loss[loss=0.1925, simple_loss=0.2676, pruned_loss=0.05872, over 24466.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2675, pruned_loss=0.06166, over 4724578.78 frames. ], batch size: 63, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:02:23,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 18:02:23,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:02:24,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:02:25,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:25,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:02:26,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:27,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:02:28,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 18:02:30,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:02:32,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=445300.0, ans=0.1 2023-09-29 18:02:33,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:33,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:02:38,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:39,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 18:02:45,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 18:02:45,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 18:02:47,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:02:52,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:02:55,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:55,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:56,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:02:59,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:00,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=445433.3333333333, ans=0.125 2023-09-29 18:03:01,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:03:01,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:02,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.43 vs. limit=15.0 2023-09-29 18:03:02,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:03:02,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:02,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:03,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=445433.3333333333, ans=0.0 2023-09-29 18:03:05,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.86 vs. limit=22.5 2023-09-29 18:03:06,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:09,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:09,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 18:03:09,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:09,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:03:12,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:03:12,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=445500.0, ans=0.2 2023-09-29 18:03:13,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.87 vs. limit=15.0 2023-09-29 18:03:13,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:03:14,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:14,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:19,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:19,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=445500.0, ans=0.125 2023-09-29 18:03:20,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:26,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:28,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:03:28,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:28,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:28,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=445566.6666666667, ans=0.125 2023-09-29 18:03:29,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:03:31,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:31,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 18:03:32,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:33,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=445566.6666666667, ans=0.0 2023-09-29 18:03:34,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:34,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 18:03:36,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:40,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=445566.6666666667, ans=0.1 2023-09-29 18:03:42,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:43,553 INFO [train.py:1039] (1/4) Epoch 13, batch 3100, loss[loss=0.1779, simple_loss=0.2357, pruned_loss=0.06006, over 22760.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2665, pruned_loss=0.06094, over 4732061.14 frames. ], batch size: 322, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:03:43,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:03:45,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:03:46,665 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.826e+02 2.024e+02 2.314e+02 3.606e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-29 18:03:47,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 18:03:50,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 18:03:51,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 18:03:52,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:03:56,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:57,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:59,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:04:02,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:07,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 18:04:13,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:04:13,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:14,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:14,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:04:15,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:04:16,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:04:16,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 18:04:16,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:04:20,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:20,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 18:04:21,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:04:25,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:04:27,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 18:04:28,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=445766.6666666667, ans=0.09899494936611666 2023-09-29 18:04:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 18:04:29,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:30,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:33,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:33,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:33,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:04:34,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=445833.3333333333, ans=0.125 2023-09-29 18:04:35,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:04:35,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:04:37,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:04:37,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:04:37,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:37,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:04:41,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:42,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 18:04:43,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:04:45,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 18:04:45,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:45,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:46,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 18:04:47,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=445833.3333333333, ans=0.1 2023-09-29 18:04:54,136 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.70 vs. limit=10.0 2023-09-29 18:04:55,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=445900.0, ans=0.0 2023-09-29 18:04:59,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 18:05:02,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=445900.0, ans=0.2 2023-09-29 18:05:03,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:04,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:06,266 INFO [train.py:1039] (1/4) Epoch 13, batch 3150, loss[loss=0.2078, simple_loss=0.2795, pruned_loss=0.06802, over 23738.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2654, pruned_loss=0.06055, over 4726258.56 frames. ], batch size: 85, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:05:06,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:05:06,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:05:08,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 18:05:09,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:09,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:05:09,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=445966.6666666667, ans=0.125 2023-09-29 18:05:11,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 18:05:14,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:15,735 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 18:05:16,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=445966.6666666667, ans=0.125 2023-09-29 18:05:17,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=445966.6666666667, ans=0.2 2023-09-29 18:05:18,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 18:05:18,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:05:20,485 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 18:05:23,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:05:24,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 18:05:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 18:05:25,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 18:05:25,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:25,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:27,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:28,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 18:05:30,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:34,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:05:39,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 18:05:39,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:05:41,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:05:43,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:44,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 18:05:47,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 18:05:48,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:05:48,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=446100.0, ans=0.125 2023-09-29 18:05:49,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:05:49,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:05:49,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:49,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:05:51,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:05:51,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:05:52,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 18:05:52,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:05:52,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:05:55,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:05:55,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:57,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 18:05:58,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:00,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 18:06:00,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:02,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 18:06:03,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 18:06:05,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:06:05,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:05,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 18:06:07,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 18:06:09,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:06:12,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:06:14,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:14,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:06:18,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:06:19,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:21,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 18:06:24,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:06:24,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 18:06:24,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=446233.3333333333, ans=0.2 2023-09-29 18:06:29,119 INFO [train.py:1039] (1/4) Epoch 13, batch 3200, loss[loss=0.18, simple_loss=0.256, pruned_loss=0.05198, over 24334.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.264, pruned_loss=0.05991, over 4722046.93 frames. ], batch size: 61, lr: 7.95e-03, grad_scale: 32.0 2023-09-29 18:06:29,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:30,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:06:30,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 18:06:32,601 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.906e+02 2.221e+02 2.638e+02 3.823e+02, threshold=4.442e+02, percent-clipped=0.0 2023-09-29 18:06:34,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:37,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=446300.0, ans=0.04949747468305833 2023-09-29 18:06:39,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:06:44,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:44,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=446366.6666666667, ans=0.125 2023-09-29 18:06:55,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:07:05,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 18:07:05,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:07:05,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=446433.3333333333, ans=0.0 2023-09-29 18:07:07,944 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.31 vs. limit=15.0 2023-09-29 18:07:08,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 18:07:10,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:07:14,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:07:14,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:07:14,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=446433.3333333333, ans=0.125 2023-09-29 18:07:15,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:07:18,767 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.87 vs. limit=15.0 2023-09-29 18:07:20,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 18:07:20,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:07:23,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 18:07:24,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=446500.0, ans=0.1 2023-09-29 18:07:26,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 18:07:29,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:07:31,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=446500.0, ans=0.125 2023-09-29 18:07:36,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:36,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:07:36,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:38,301 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 18:07:38,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:07:41,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:07:43,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 18:07:43,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 18:07:45,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 18:07:45,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=446566.6666666667, ans=0.0 2023-09-29 18:07:47,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 18:07:49,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:07:52,558 INFO [train.py:1039] (1/4) Epoch 13, batch 3250, loss[loss=0.1953, simple_loss=0.2643, pruned_loss=0.06311, over 23259.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2639, pruned_loss=0.06016, over 4723738.58 frames. ], batch size: 119, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:07:52,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:07:52,691 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 18:07:52,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:07:52,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:07:54,263 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 18:07:58,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:08:02,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:09,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:08:09,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 18:08:10,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:12,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:08:12,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:13,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:14,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:08:14,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=446700.0, ans=0.1 2023-09-29 18:08:16,598 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.63 vs. limit=15.0 2023-09-29 18:08:17,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:08:17,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:17,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:19,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:08:23,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:24,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:26,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:26,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:28,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:28,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:28,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:08:30,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=446766.6666666667, ans=0.0 2023-09-29 18:08:33,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 18:08:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:34,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:08:36,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:36,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:08:44,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:08:46,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=446833.3333333333, ans=0.0 2023-09-29 18:08:46,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=446833.3333333333, ans=0.2 2023-09-29 18:08:53,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:08:54,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:54,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 18:08:54,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:08:54,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:08:54,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:59,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 18:08:59,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 18:09:00,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:09:02,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:03,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:03,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:09:03,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:05,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=446900.0, ans=0.1 2023-09-29 18:09:06,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:06,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:07,403 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.20 vs. limit=12.0 2023-09-29 18:09:08,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 18:09:08,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:11,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:09:11,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 18:09:14,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:09:14,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 18:09:16,077 INFO [train.py:1039] (1/4) Epoch 13, batch 3300, loss[loss=0.1944, simple_loss=0.2747, pruned_loss=0.05705, over 24661.00 frames. ], tot_loss[loss=0.193, simple_loss=0.265, pruned_loss=0.06051, over 4732080.51 frames. ], batch size: 65, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:09:16,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 18:09:18,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 18:09:18,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:18,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=446966.6666666667, ans=0.0 2023-09-29 18:09:21,251 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.922e+02 2.153e+02 2.771e+02 4.428e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 18:09:22,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:24,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:09:24,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:26,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:09:27,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:09:29,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:31,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:35,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 18:09:37,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:09:37,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:40,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:40,295 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 18:09:41,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:09:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:09:43,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:09:43,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:09:43,325 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 18:09:47,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:49,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:09:52,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:52,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 18:09:54,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:09:54,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:55,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:09:57,503 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 18:09:59,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 18:09:59,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:10:02,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 18:10:05,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:08,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:10:09,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:10,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=22.5 2023-09-29 18:10:11,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:12,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:12,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:10:12,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:10:15,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:10:16,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:17,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:10:18,999 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 18:10:20,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 18:10:22,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:10:23,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:10:23,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:25,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:10:27,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:27,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:10:28,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:31,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:10:34,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 18:10:34,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:35,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=447233.3333333333, ans=0.0 2023-09-29 18:10:36,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:36,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:10:36,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:38,620 INFO [train.py:1039] (1/4) Epoch 13, batch 3350, loss[loss=0.2014, simple_loss=0.2782, pruned_loss=0.06229, over 24386.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2656, pruned_loss=0.06029, over 4736460.38 frames. ], batch size: 77, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:10:38,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:39,427 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.51 vs. limit=15.0 2023-09-29 18:10:41,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:41,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:45,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=447300.0, ans=0.125 2023-09-29 18:10:46,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.49 vs. limit=15.0 2023-09-29 18:10:47,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:50,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:51,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:10:53,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:55,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:10:56,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:58,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:10:59,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 18:11:01,148 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 18:11:02,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:11:02,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=447366.6666666667, ans=0.015 2023-09-29 18:11:04,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 18:11:04,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 18:11:06,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:11:06,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:11:07,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:08,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 18:11:08,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:09,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:11:11,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:12,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:14,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:14,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:11:14,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=447433.3333333333, ans=0.0 2023-09-29 18:11:18,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:21,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:22,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:23,120 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.75 vs. limit=22.5 2023-09-29 18:11:26,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:11:28,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:29,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:29,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:31,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:33,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 18:11:33,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:11:33,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 18:11:34,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:11:34,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 18:11:36,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:37,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:44,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:46,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 18:11:46,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:11:48,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:11:50,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:11:57,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:11:58,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 18:11:58,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=447566.6666666667, ans=0.125 2023-09-29 18:12:00,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:12:00,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=447633.3333333333, ans=0.0 2023-09-29 18:12:01,679 INFO [train.py:1039] (1/4) Epoch 13, batch 3400, loss[loss=0.2014, simple_loss=0.2861, pruned_loss=0.05837, over 24664.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2675, pruned_loss=0.06085, over 4732682.43 frames. ], batch size: 68, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:12:01,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:12:03,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:03,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 18:12:03,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:03,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 18:12:06,370 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.928e+02 2.132e+02 2.448e+02 3.305e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 18:12:06,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:06,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:08,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:12:08,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:12:08,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 18:12:13,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 18:12:13,271 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 18:12:13,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:18,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:12:18,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:12:20,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:20,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:12:25,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:28,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 18:12:34,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:12:37,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:37,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:38,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 18:12:43,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:12:49,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 18:12:54,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:54,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:56,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 18:12:56,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:56,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:58,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:59,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:13:02,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:13:06,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:13:06,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:13:11,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:14,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 18:13:19,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:13:23,693 INFO [train.py:1039] (1/4) Epoch 13, batch 3450, loss[loss=0.184, simple_loss=0.2607, pruned_loss=0.05362, over 24656.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2668, pruned_loss=0.06076, over 4726550.30 frames. ], batch size: 65, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:13:23,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 18:13:28,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 18:13:28,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:13:30,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:13:30,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 18:13:32,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:37,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:13:40,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:13:41,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:13:43,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:13:43,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:45,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:52,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 18:13:58,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 18:13:58,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:14:00,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:14:00,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:09,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 18:14:09,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:14:12,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:12,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:14:14,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=448166.6666666667, ans=0.0 2023-09-29 18:14:15,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:14:16,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:14:19,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 18:14:19,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:19,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:14:22,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:14:25,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 18:14:28,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:14:33,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:14:34,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:37,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:42,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:42,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:44,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:14:44,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:48,031 INFO [train.py:1039] (1/4) Epoch 13, batch 3500, loss[loss=0.1821, simple_loss=0.2509, pruned_loss=0.05664, over 24489.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2645, pruned_loss=0.05995, over 4724167.96 frames. ], batch size: 58, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:14:49,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:52,617 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.904e+02 2.170e+02 2.519e+02 3.488e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-29 18:14:52,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:14:54,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 18:14:55,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:14:59,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:15:02,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:15:02,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 18:15:02,706 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.19 vs. limit=15.0 2023-09-29 18:15:08,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:15:09,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:15:09,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:15:09,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:09,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:15:11,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:12,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:12,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 18:15:12,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=448366.6666666667, ans=0.125 2023-09-29 18:15:12,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=448366.6666666667, ans=0.125 2023-09-29 18:15:12,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=448366.6666666667, ans=0.125 2023-09-29 18:15:15,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:15,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:15:17,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:19,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.84 vs. limit=15.0 2023-09-29 18:15:21,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:22,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 18:15:22,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:25,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:27,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:15:28,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:30,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:15:31,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:33,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 18:15:33,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 18:15:33,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=448433.3333333333, ans=0.0 2023-09-29 18:15:34,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 18:15:34,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:37,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:37,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:37,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:15:38,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=448500.0, ans=0.125 2023-09-29 18:15:41,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:15:41,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:15:47,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:15:49,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 18:15:49,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 18:15:49,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:15:52,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:15:52,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:15:54,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:57,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 18:15:57,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:16:00,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:16:01,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 18:16:03,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 18:16:05,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:06,082 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.53 vs. limit=15.0 2023-09-29 18:16:06,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:16:06,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:06,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:09,891 INFO [train.py:1039] (1/4) Epoch 13, batch 3550, loss[loss=0.1927, simple_loss=0.2748, pruned_loss=0.05531, over 24411.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2631, pruned_loss=0.05936, over 4715058.67 frames. ], batch size: 69, lr: 7.92e-03, grad_scale: 16.0 2023-09-29 18:16:10,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:16:20,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:22,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 18:16:24,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=448633.3333333333, ans=0.125 2023-09-29 18:16:26,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:16:27,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:16:29,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:31,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:16:31,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:16:35,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:35,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:16:35,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:35,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:16:37,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:16:43,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:16:43,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:45,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:16:45,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:45,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:16:46,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 18:16:46,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:48,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.89 vs. limit=12.0 2023-09-29 18:16:49,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:51,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:16:57,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:58,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:17:00,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:02,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 18:17:02,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:17:02,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 18:17:03,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:17:05,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:17:05,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:17:08,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 18:17:10,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:10,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=448833.3333333333, ans=0.125 2023-09-29 18:17:16,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:16,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 18:17:18,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:21,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:17:23,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 18:17:30,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 18:17:30,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:17:31,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:17:33,875 INFO [train.py:1039] (1/4) Epoch 13, batch 3600, loss[loss=0.1834, simple_loss=0.2532, pruned_loss=0.0568, over 21581.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.263, pruned_loss=0.05907, over 4717845.83 frames. ], batch size: 47, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:17:34,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=448966.6666666667, ans=0.1 2023-09-29 18:17:35,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:17:39,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.817e+02 2.056e+02 2.414e+02 4.361e+02, threshold=4.112e+02, percent-clipped=1.0 2023-09-29 18:17:40,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:43,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:44,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:17:45,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:17:45,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:45,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 18:17:48,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:17:48,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:51,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:17:55,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:17:56,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:17:56,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:58,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 18:17:58,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:18:00,636 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:18:01,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:18:01,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:18:03,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:07,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:18:07,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:07,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=449100.0, ans=0.0 2023-09-29 18:18:09,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 18:18:16,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:18,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:18:18,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 18:18:21,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=449166.6666666667, ans=0.0 2023-09-29 18:18:23,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:18:28,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:31,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:37,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:18:37,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:18:37,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 18:18:38,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 18:18:40,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 18:18:42,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:44,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:18:45,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 18:18:45,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:18:47,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:18:47,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:47,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 18:18:49,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 18:18:49,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=449233.3333333333, ans=0.04949747468305833 2023-09-29 18:18:52,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:53,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 18:18:56,392 INFO [train.py:1039] (1/4) Epoch 13, batch 3650, loss[loss=0.1739, simple_loss=0.2493, pruned_loss=0.04923, over 23458.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2646, pruned_loss=0.05985, over 4710015.59 frames. ], batch size: 134, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:18:58,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 18:18:59,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:19:03,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 18:19:03,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=449300.0, ans=0.1 2023-09-29 18:19:05,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 18:19:07,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=449300.0, ans=0.125 2023-09-29 18:19:11,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:11,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:19:13,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:19:16,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:19:16,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:19:17,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 18:19:18,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:19:18,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:20,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 18:19:20,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:19:20,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:19:22,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:24,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:19:28,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 18:19:28,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 18:19:29,463 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=6.71 vs. limit=15.0 2023-09-29 18:19:30,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:19:30,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=449433.3333333333, ans=0.0 2023-09-29 18:19:30,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=449433.3333333333, ans=0.2 2023-09-29 18:19:31,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 18:19:32,375 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.52 vs. limit=15.0 2023-09-29 18:19:33,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:33,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:19:39,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:19:41,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:41,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:19:42,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.30 vs. limit=8.0 2023-09-29 18:19:43,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:19:43,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:19:45,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:19:48,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:49,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:51,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:53,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:19:55,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:56,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:02,034 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 18:20:05,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:05,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:06,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:20:06,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:08,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:20:09,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:11,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 18:20:11,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:15,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:20:15,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=449566.6666666667, ans=0.09899494936611666 2023-09-29 18:20:17,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:20:17,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:20:20,444 INFO [train.py:1039] (1/4) Epoch 13, batch 3700, loss[loss=0.1754, simple_loss=0.2499, pruned_loss=0.0504, over 24639.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2661, pruned_loss=0.06069, over 4717952.55 frames. ], batch size: 60, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:20:20,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:20,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 18:20:20,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:22,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:20:22,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:20:25,577 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.943e+02 2.154e+02 2.473e+02 4.046e+02, threshold=4.307e+02, percent-clipped=0.0 2023-09-29 18:20:25,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:20:30,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:32,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:33,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:20:33,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:35,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:20:38,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:39,794 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 18:20:43,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=449700.0, ans=0.0 2023-09-29 18:20:49,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:20:49,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:20:50,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:20:50,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 18:20:50,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=449766.6666666667, ans=0.07 2023-09-29 18:20:51,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:20:54,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:56,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 18:20:57,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:58,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:21:01,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:02,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:21:04,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:21:08,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:21:08,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 18:21:09,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:21:10,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 18:21:16,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:21:17,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:21:20,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:20,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 18:21:23,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:21:23,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:21:23,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:23,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:26,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-09-29 18:21:27,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:29,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 18:21:31,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 18:21:31,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:21:31,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:32,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:21:34,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:21:37,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:39,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:21:40,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:21:42,139 INFO [train.py:1039] (1/4) Epoch 13, batch 3750, loss[loss=0.1915, simple_loss=0.2694, pruned_loss=0.05675, over 23935.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2675, pruned_loss=0.06122, over 4724227.37 frames. ], batch size: 80, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:21:42,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 18:21:44,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:21:45,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:21:46,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=449966.6666666667, ans=0.1 2023-09-29 18:21:47,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 18:21:47,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:21:49,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:49,354 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:21:50,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:53,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:21:57,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:01,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:22:01,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:22:04,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:22:08,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:08,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 18:22:10,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:12,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:12,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:15,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 18:22:18,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 18:22:18,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=450100.0, ans=0.0 2023-09-29 18:22:20,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:21,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:23,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:25,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=450100.0, ans=0.125 2023-09-29 18:22:29,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:31,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:22:34,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 18:22:39,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:42,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=450166.6666666667, ans=0.1 2023-09-29 18:22:43,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:22:44,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:22:45,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=450166.6666666667, ans=0.1 2023-09-29 18:22:47,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:22:51,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:22:52,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:22:55,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:22:56,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.21 vs. limit=15.0 2023-09-29 18:22:57,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:22:59,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:23:03,487 INFO [train.py:1039] (1/4) Epoch 13, batch 3800, loss[loss=0.196, simple_loss=0.2818, pruned_loss=0.05507, over 24574.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2682, pruned_loss=0.06143, over 4722035.34 frames. ], batch size: 71, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:23:06,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:23:08,362 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.938e+02 2.125e+02 2.387e+02 3.006e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 18:23:12,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:12,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:23:12,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=450300.0, ans=0.125 2023-09-29 18:23:14,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 18:23:14,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:17,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:19,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:23:22,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:23:22,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:23,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:23:23,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=450366.6666666667, ans=0.125 2023-09-29 18:23:25,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:25,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:23:25,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:25,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=450366.6666666667, ans=0.07 2023-09-29 18:23:26,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 18:23:29,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 18:23:31,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:23:36,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:39,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:23:39,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:23:41,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:23:41,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:42,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:43,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:48,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:23:48,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 18:23:51,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:23:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:24:03,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:05,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 18:24:05,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 18:24:07,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:08,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:24:10,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:10,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 18:24:13,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 18:24:13,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 18:24:15,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:15,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:24:23,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:24:24,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:24:25,918 INFO [train.py:1039] (1/4) Epoch 13, batch 3850, loss[loss=0.1965, simple_loss=0.283, pruned_loss=0.05503, over 24657.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2666, pruned_loss=0.06142, over 4722568.00 frames. ], batch size: 68, lr: 7.91e-03, grad_scale: 16.0 2023-09-29 18:24:27,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:24:29,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 18:24:29,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:24:30,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:31,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=450633.3333333333, ans=0.125 2023-09-29 18:24:35,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:24:37,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:41,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:24:41,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=450700.0, ans=0.125 2023-09-29 18:24:41,595 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.41 vs. limit=15.0 2023-09-29 18:24:42,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 18:24:48,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:50,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:51,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=450700.0, ans=0.0 2023-09-29 18:24:52,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:24:54,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:24:55,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:57,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:57,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:57,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:24:58,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:01,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:03,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:03,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:25:03,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 18:25:03,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 18:25:05,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:05,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:09,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:09,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:09,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 18:25:11,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 18:25:13,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:13,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=450833.3333333333, ans=0.0 2023-09-29 18:25:16,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 18:25:19,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:25:22,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:24,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:26,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=450833.3333333333, ans=0.0 2023-09-29 18:25:29,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:29,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 18:25:32,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 18:25:33,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=450900.0, ans=0.125 2023-09-29 18:25:35,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:35,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:38,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:25:38,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:25:40,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:25:40,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 18:25:42,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:43,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 18:25:43,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:43,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:44,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.76 vs. limit=15.0 2023-09-29 18:25:47,088 INFO [train.py:1039] (1/4) Epoch 13, batch 3900, loss[loss=0.1781, simple_loss=0.2505, pruned_loss=0.05281, over 24295.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2645, pruned_loss=0.06101, over 4710086.68 frames. ], batch size: 61, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:25:47,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:25:47,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:48,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:25:50,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:50,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:51,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:25:51,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 18:25:53,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:54,605 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.889e+02 2.168e+02 2.543e+02 3.582e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 18:25:56,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:25:57,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:25:57,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:25:57,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:26:03,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:26:04,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:06,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:26:07,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 18:26:07,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:08,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=451033.3333333333, ans=0.1 2023-09-29 18:26:09,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 18:26:11,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:11,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 18:26:11,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=451033.3333333333, ans=0.125 2023-09-29 18:26:12,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 18:26:18,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:20,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:26:20,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:26:22,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:26:25,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:25,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=451100.0, ans=0.0 2023-09-29 18:26:26,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:26:27,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=451100.0, ans=0.0 2023-09-29 18:26:29,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:26:29,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:26:30,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=451100.0, ans=0.0 2023-09-29 18:26:31,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:26:36,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:36,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:26:42,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:26:44,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:26:57,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:27:00,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:00,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 18:27:01,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 18:27:01,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:02,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 18:27:03,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:27:05,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=451300.0, ans=0.2 2023-09-29 18:27:06,298 INFO [train.py:1039] (1/4) Epoch 13, batch 3950, loss[loss=0.2126, simple_loss=0.2812, pruned_loss=0.07199, over 23360.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2641, pruned_loss=0.0601, over 4722969.16 frames. ], batch size: 105, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:27:06,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 18:27:12,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:27:14,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 18:27:15,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:27:17,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:27:18,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:27:23,686 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 18:27:23,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=451366.6666666667, ans=0.125 2023-09-29 18:27:25,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:25,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 18:27:25,165 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 18:27:25,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:27:28,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:29,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:27:29,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:30,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=451366.6666666667, ans=0.125 2023-09-29 18:27:32,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 18:27:33,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=451366.6666666667, ans=0.0 2023-09-29 18:27:34,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.12 vs. limit=15.0 2023-09-29 18:27:35,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:27:35,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:35,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:27:36,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:27:36,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:27:38,964 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.58 vs. limit=10.0 2023-09-29 18:27:49,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=451433.3333333333, ans=0.0 2023-09-29 18:27:50,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:27:50,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:27:51,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=451433.3333333333, ans=0.2 2023-09-29 18:27:51,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=451433.3333333333, ans=0.07 2023-09-29 18:27:55,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=451500.0, ans=0.125 2023-09-29 18:27:57,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 18:28:05,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 18:28:05,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 18:28:05,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:28:05,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:28:05,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=451500.0, ans=0.125 2023-09-29 18:28:13,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:28:13,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:28:13,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:28:13,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:28:14,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 18:28:21,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:28:22,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:28:26,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 18:28:30,226 INFO [train.py:1039] (1/4) Epoch 13, batch 4000, loss[loss=0.2024, simple_loss=0.2687, pruned_loss=0.06809, over 23528.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2656, pruned_loss=0.06062, over 4724560.40 frames. ], batch size: 256, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:28:37,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:38,703 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.910e+02 2.120e+02 2.727e+02 3.930e+02, threshold=4.239e+02, percent-clipped=0.0 2023-09-29 18:28:43,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:48,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:49,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:28:49,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:51,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 18:28:51,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:28:53,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 18:28:53,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:28:53,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 18:28:55,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:57,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=451700.0, ans=0.0 2023-09-29 18:28:58,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:28:58,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:28:58,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:29:00,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:00,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:29:01,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:29:03,305 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 18:29:04,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:29:04,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:07,970 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 18:29:09,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:29:09,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:16,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 18:29:17,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:19,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:29:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 18:29:23,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:29:23,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 18:29:25,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:29:27,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:27,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:29:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:29:29,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=451833.3333333333, ans=0.0 2023-09-29 18:29:30,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:29:30,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:32,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 18:29:32,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=451833.3333333333, ans=0.2 2023-09-29 18:29:33,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:35,566 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 18:29:37,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=451900.0, ans=0.125 2023-09-29 18:29:40,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:29:43,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:29:46,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:29:46,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:48,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:29:48,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:29:51,466 INFO [train.py:1039] (1/4) Epoch 13, batch 4050, loss[loss=0.2001, simple_loss=0.2638, pruned_loss=0.06825, over 23465.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2661, pruned_loss=0.06048, over 4734192.31 frames. ], batch size: 134, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:29:54,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:56,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:29:56,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 18:29:59,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:29:59,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:01,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:30:03,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:04,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:08,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:11,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:13,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:30:14,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:30:16,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:30:19,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:20,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:22,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 18:30:25,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 18:30:25,561 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 18:30:28,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:30:35,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 18:30:37,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:30:40,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:43,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:44,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:30:44,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:46,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:49,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=452166.6666666667, ans=0.125 2023-09-29 18:30:50,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 18:30:50,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:30:52,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:30:54,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 18:30:58,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:31:07,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 18:31:07,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:07,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:31:10,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 18:31:12,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 18:31:12,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:14,148 INFO [train.py:1039] (1/4) Epoch 13, batch 4100, loss[loss=0.1743, simple_loss=0.242, pruned_loss=0.05331, over 24447.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2659, pruned_loss=0.06075, over 4723620.39 frames. ], batch size: 58, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:31:14,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:15,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:16,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:31:16,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=452300.0, ans=0.125 2023-09-29 18:31:22,211 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.980e+02 2.231e+02 2.743e+02 3.910e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 18:31:22,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 18:31:25,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 18:31:25,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=452300.0, ans=0.125 2023-09-29 18:31:25,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=452300.0, ans=0.125 2023-09-29 18:31:26,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 18:31:28,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 18:31:28,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:29,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:31:30,169 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 18:31:33,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:34,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:31:34,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:38,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:31:38,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.43 vs. limit=15.0 2023-09-29 18:31:41,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:31:43,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:43,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:31:43,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 18:31:43,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:43,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:31:43,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:45,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:31:45,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 18:31:49,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:31:50,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 18:31:52,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:55,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:55,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 18:31:56,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-09-29 18:31:56,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:57,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:31:58,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:32:00,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 18:32:00,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:32:00,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=452433.3333333333, ans=0.0 2023-09-29 18:32:01,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:32:04,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 18:32:04,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:06,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:09,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:11,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=452500.0, ans=0.1 2023-09-29 18:32:15,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:16,033 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-09-29 18:32:17,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:19,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:32:26,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=452566.6666666667, ans=0.2 2023-09-29 18:32:26,572 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.77 vs. limit=15.0 2023-09-29 18:32:27,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:27,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:27,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=452566.6666666667, ans=0.2 2023-09-29 18:32:30,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=452566.6666666667, ans=0.125 2023-09-29 18:32:31,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:34,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:32:36,435 INFO [train.py:1039] (1/4) Epoch 13, batch 4150, loss[loss=0.2064, simple_loss=0.2696, pruned_loss=0.07159, over 23818.00 frames. ], tot_loss[loss=0.195, simple_loss=0.267, pruned_loss=0.06147, over 4718623.60 frames. ], batch size: 179, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:32:38,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:38,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:32:39,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:32:39,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:32:41,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 18:32:42,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:44,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 18:32:46,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 18:32:46,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 18:32:48,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:52,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=452700.0, ans=0.125 2023-09-29 18:32:53,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:32:53,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:57,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:59,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:00,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:33:02,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:33:02,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:33:03,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:33:05,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=452700.0, ans=0.125 2023-09-29 18:33:08,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:12,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=452766.6666666667, ans=0.125 2023-09-29 18:33:13,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:14,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 18:33:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 18:33:16,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:33:18,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 18:33:18,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:33:18,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:20,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=452766.6666666667, ans=0.2 2023-09-29 18:33:21,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:22,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:25,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 18:33:30,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:31,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:33:31,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 18:33:33,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:33,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 18:33:35,184 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:33:35,758 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.35 vs. limit=22.5 2023-09-29 18:33:37,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:33:39,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:40,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:42,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 18:33:42,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:42,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:33:44,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:33:44,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=452900.0, ans=0.2 2023-09-29 18:33:45,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 18:33:45,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:45,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:33:47,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:33:48,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 18:33:48,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:48,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:33:50,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:51,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:51,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 18:33:51,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:59,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:34:00,823 INFO [train.py:1039] (1/4) Epoch 13, batch 4200, loss[loss=0.177, simple_loss=0.2218, pruned_loss=0.06613, over 19221.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.266, pruned_loss=0.06158, over 4707589.03 frames. ], batch size: 389, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:34:01,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 18:34:04,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:34:06,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:07,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:34:09,029 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.933e+02 2.202e+02 2.518e+02 4.955e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-29 18:34:09,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:09,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:10,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 18:34:15,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 18:34:15,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:15,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=453033.3333333333, ans=0.125 2023-09-29 18:34:17,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:21,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:34:24,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:34:26,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:34:26,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:28,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 18:34:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:28,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:30,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:30,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:34:31,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:34:31,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=453100.0, ans=0.1 2023-09-29 18:34:35,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 18:34:35,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:39,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:34:41,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:34:44,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:34:44,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:34:47,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:34:47,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 18:34:47,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:34:49,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:34:55,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:34:56,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:35:02,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:35:05,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 18:35:07,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:35:12,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:35:14,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:17,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 18:35:22,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:35:23,567 INFO [train.py:1039] (1/4) Epoch 13, batch 4250, loss[loss=0.2133, simple_loss=0.2712, pruned_loss=0.07765, over 23767.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2645, pruned_loss=0.06119, over 4711698.41 frames. ], batch size: 212, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:35:25,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:35:25,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:35:27,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:36,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:35:37,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 18:35:37,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:35:40,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:35:46,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=453366.6666666667, ans=0.125 2023-09-29 18:35:49,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:49,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:51,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:35:51,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:35:52,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:52,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:53,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=453366.6666666667, ans=0.125 2023-09-29 18:35:54,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:57,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:35:58,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:00,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 18:36:03,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 18:36:03,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:04,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:04,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:36:06,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:36:06,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:06,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:11,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:36:11,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:36:18,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:18,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:20,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 18:36:20,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:36:22,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 18:36:24,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:36:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:36:26,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=453500.0, ans=0.0 2023-09-29 18:36:28,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:28,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:36:28,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 18:36:30,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:36:31,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:36:36,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:38,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:39,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:36:42,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:42,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:43,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:36:45,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:36:45,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 18:36:46,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:50,402 INFO [train.py:1039] (1/4) Epoch 13, batch 4300, loss[loss=0.2205, simple_loss=0.2646, pruned_loss=0.08817, over 19234.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2633, pruned_loss=0.06046, over 4696004.70 frames. ], batch size: 388, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:36:52,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:52,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:36:57,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:58,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 2.045e+02 2.305e+02 2.959e+02 4.581e+02, threshold=4.610e+02, percent-clipped=2.0 2023-09-29 18:37:04,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:37:04,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 18:37:06,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:37:08,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:37:08,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:37:08,211 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 18:37:11,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:37:13,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=453700.0, ans=0.125 2023-09-29 18:37:14,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:19,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 18:37:19,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:37:20,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 18:37:20,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=453766.6666666667, ans=0.125 2023-09-29 18:37:22,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:37:23,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:37:29,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:37:29,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:37:31,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:37:31,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:33,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:37:33,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 18:37:34,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 18:37:37,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:37:38,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=22.5 2023-09-29 18:37:41,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:37:41,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:41,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 18:37:41,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 18:37:42,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 18:37:43,772 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=15.65 vs. limit=15.0 2023-09-29 18:37:44,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:37:44,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 18:37:44,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 18:37:47,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:49,056 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 18:37:50,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:37:50,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=453833.3333333333, ans=0.0 2023-09-29 18:37:50,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=453833.3333333333, ans=0.2 2023-09-29 18:37:52,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:37:52,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:54,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 18:37:55,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:55,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:55,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:37:55,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:37:57,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:37:58,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:38:01,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=453900.0, ans=0.2 2023-09-29 18:38:02,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:04,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:04,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:38:09,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 18:38:10,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:38:11,802 INFO [train.py:1039] (1/4) Epoch 13, batch 4350, loss[loss=0.1863, simple_loss=0.2593, pruned_loss=0.05665, over 23555.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.264, pruned_loss=0.06033, over 4717483.48 frames. ], batch size: 120, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:38:12,337 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:38:15,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:18,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:21,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:38:21,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:38:23,799 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.87 vs. limit=12.0 2023-09-29 18:38:25,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:38:31,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:32,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:38:32,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:38:37,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:38:39,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:38:40,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:38:42,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=454033.3333333333, ans=0.0 2023-09-29 18:38:46,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 18:38:46,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=454100.0, ans=0.0 2023-09-29 18:38:48,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:48,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:48,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=454100.0, ans=0.2 2023-09-29 18:38:54,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:57,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 18:39:00,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:02,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:39:08,744 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 18:39:08,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:09,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.17 vs. limit=15.0 2023-09-29 18:39:10,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:39:10,477 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 18:39:12,591 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 18:39:12,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:14,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:16,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:39:16,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:16,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=454233.3333333333, ans=0.125 2023-09-29 18:39:17,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-09-29 18:39:19,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:39:22,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 18:39:22,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:22,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:22,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:23,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 18:39:25,296 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 18:39:25,303 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 18:39:25,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 18:39:27,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.70 vs. limit=15.0 2023-09-29 18:39:28,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:39:29,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:39:29,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:31,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:39:32,927 INFO [train.py:1039] (1/4) Epoch 13, batch 4400, loss[loss=0.1873, simple_loss=0.2623, pruned_loss=0.05614, over 24301.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2651, pruned_loss=0.06044, over 4716265.81 frames. ], batch size: 61, lr: 7.88e-03, grad_scale: 32.0 2023-09-29 18:39:33,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 18:39:35,967 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 18:39:35,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:39,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:39,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:41,133 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.889e+02 2.108e+02 2.568e+02 3.749e+02, threshold=4.217e+02, percent-clipped=0.0 2023-09-29 18:39:42,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:44,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 18:39:44,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 18:39:44,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 18:39:46,078 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 18:39:46,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:39:46,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:46,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=454300.0, ans=0.125 2023-09-29 18:39:50,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 18:39:51,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:51,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:51,765 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 18:39:55,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:55,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 18:39:55,396 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 18:39:58,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 18:39:59,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 18:39:59,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 18:39:59,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:01,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:04,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 18:40:04,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 18:40:04,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:07,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:40:07,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:09,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:09,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:09,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 18:40:10,777 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 18:40:11,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=454433.3333333333, ans=6.0 2023-09-29 18:40:14,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:16,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=454433.3333333333, ans=0.0 2023-09-29 18:40:20,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=454433.3333333333, ans=0.2 2023-09-29 18:40:24,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:28,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 18:40:33,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:40:34,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:37,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:40:37,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 18:40:37,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:40:38,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:40:38,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:40:39,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:40:41,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 18:40:44,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 18:40:45,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 18:40:45,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:45,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 18:40:47,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:40:49,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=454566.6666666667, ans=0.0 2023-09-29 18:40:51,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:40:54,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 18:40:56,116 INFO [train.py:1039] (1/4) Epoch 13, batch 4450, loss[loss=0.1834, simple_loss=0.2703, pruned_loss=0.04825, over 24344.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2668, pruned_loss=0.06119, over 4707975.36 frames. ], batch size: 74, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:40:56,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:58,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:58,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:41:06,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:06,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:41:12,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:12,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=454700.0, ans=0.1 2023-09-29 18:41:13,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:41:17,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:41:17,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:18,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 18:41:18,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:18,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:18,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:41:18,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:41:21,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:41:21,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=454700.0, ans=0.125 2023-09-29 18:41:27,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:28,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:29,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:31,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:31,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:41:33,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.63 vs. limit=15.0 2023-09-29 18:41:35,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:41:37,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 18:41:37,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 18:41:37,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:41:41,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:42,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 18:41:43,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=454766.6666666667, ans=0.125 2023-09-29 18:41:45,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:41:50,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:50,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 18:41:50,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:50,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:41:50,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:41:50,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:53,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:55,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:41:57,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 18:41:59,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:42:02,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:03,160 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:42:04,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:42:05,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:07,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:42:07,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:42:09,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=454900.0, ans=0.125 2023-09-29 18:42:10,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 18:42:13,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:42:17,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:18,807 INFO [train.py:1039] (1/4) Epoch 13, batch 4500, loss[loss=0.1791, simple_loss=0.2684, pruned_loss=0.04486, over 24649.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2666, pruned_loss=0.06148, over 4707142.71 frames. ], batch size: 73, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:42:18,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 18:42:18,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 18:42:20,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:25,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:25,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:25,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=454966.6666666667, ans=0.2 2023-09-29 18:42:26,534 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 2.036e+02 2.219e+02 2.497e+02 4.181e+02, threshold=4.438e+02, percent-clipped=0.0 2023-09-29 18:42:28,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:42:28,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:42:30,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:30,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:39,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=455033.3333333333, ans=0.125 2023-09-29 18:42:43,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:43,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:42:47,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:49,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:42:51,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:42:53,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.78 vs. limit=12.0 2023-09-29 18:42:57,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:43:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:43:07,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:43:10,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.62 vs. limit=12.0 2023-09-29 18:43:11,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:43:11,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=455166.6666666667, ans=0.125 2023-09-29 18:43:12,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 18:43:13,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:13,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:43:17,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:43:17,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 18:43:17,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:43:17,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:22,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:43:22,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:43:26,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:29,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:43:29,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:43:29,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=455233.3333333333, ans=0.0 2023-09-29 18:43:31,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 18:43:31,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=455233.3333333333, ans=0.1 2023-09-29 18:43:32,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 18:43:32,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 18:43:38,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 18:43:40,820 INFO [train.py:1039] (1/4) Epoch 13, batch 4550, loss[loss=0.1841, simple_loss=0.2533, pruned_loss=0.05751, over 23469.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2646, pruned_loss=0.06075, over 4693451.37 frames. ], batch size: 134, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:43:41,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 18:43:44,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:43:46,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:47,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:49,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:43:54,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:43:55,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:58,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:43:58,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:43:58,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:00,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:02,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:44:04,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:07,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 18:44:08,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 18:44:10,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:44:11,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 18:44:16,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 18:44:16,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:20,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 18:44:23,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:44:25,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:44:28,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 18:44:31,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:34,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:34,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:37,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:37,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 18:44:39,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 18:44:39,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:44:39,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 18:44:39,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=455500.0, ans=0.0 2023-09-29 18:44:43,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 18:44:43,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:45,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:45,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:47,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:47,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:44:48,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:44:50,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 18:44:52,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:52,800 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=15.0 2023-09-29 18:44:54,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:44:54,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 18:44:54,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:44:54,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 18:44:57,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:44:57,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:45:00,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:45:00,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:45:01,774 INFO [train.py:1039] (1/4) Epoch 13, batch 4600, loss[loss=0.1901, simple_loss=0.2729, pruned_loss=0.05363, over 24045.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.263, pruned_loss=0.05974, over 4703786.80 frames. ], batch size: 80, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:45:01,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:45:03,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:45:05,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:45:07,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:07,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:45:10,378 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.841e+02 2.065e+02 2.321e+02 3.867e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-29 18:45:10,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:45:10,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:45:12,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:13,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 18:45:15,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:45:19,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:45:21,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:22,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:23,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=455700.0, ans=0.0 2023-09-29 18:45:25,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=455700.0, ans=0.125 2023-09-29 18:45:30,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 18:45:31,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:32,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=455700.0, ans=0.2 2023-09-29 18:45:34,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:38,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:45:38,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:40,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=455766.6666666667, ans=0.2 2023-09-29 18:45:44,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 18:45:44,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:45:44,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:45:44,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=455766.6666666667, ans=0.1 2023-09-29 18:45:49,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:49,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:45:51,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:45:54,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 18:45:57,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:46:01,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:03,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:05,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:05,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 18:46:06,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:06,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 18:46:06,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:08,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:09,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:09,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:46:11,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:11,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 18:46:13,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 18:46:13,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 18:46:13,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:16,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:18,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=455900.0, ans=0.125 2023-09-29 18:46:21,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=455900.0, ans=0.0 2023-09-29 18:46:24,437 INFO [train.py:1039] (1/4) Epoch 13, batch 4650, loss[loss=0.1781, simple_loss=0.2491, pruned_loss=0.0536, over 24402.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2626, pruned_loss=0.05964, over 4691998.63 frames. ], batch size: 58, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:46:24,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:46:28,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:29,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:29,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:46:29,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:30,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:30,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:30,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=455966.6666666667, ans=0.2 2023-09-29 18:46:35,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 18:46:39,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:46:43,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 18:46:43,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:43,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 18:46:43,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:46:44,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 18:46:44,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 18:46:46,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:46,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:46:49,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:46:50,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:52,459 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 18:46:52,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=456033.3333333333, ans=0.0 2023-09-29 18:46:54,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:55,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 18:46:58,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:58,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:47:00,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 18:47:00,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:04,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:47:09,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:13,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:15,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:16,533 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.44 vs. limit=22.5 2023-09-29 18:47:17,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:17,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=456166.6666666667, ans=0.2 2023-09-29 18:47:18,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:47:20,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 18:47:21,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 18:47:22,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 18:47:22,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 18:47:23,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:29,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:47:29,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:31,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 18:47:31,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:31,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:47:33,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:47:36,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:47:36,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:39,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:40,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:42,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:47:42,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:47:42,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 18:47:43,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:47:45,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 18:47:46,876 INFO [train.py:1039] (1/4) Epoch 13, batch 4700, loss[loss=0.2164, simple_loss=0.2782, pruned_loss=0.07732, over 23656.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2636, pruned_loss=0.05999, over 4693154.16 frames. ], batch size: 256, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:47:52,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:52,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=456300.0, ans=0.125 2023-09-29 18:47:53,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:54,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:55,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.032e+02 2.349e+02 2.827e+02 4.344e+02, threshold=4.699e+02, percent-clipped=1.0 2023-09-29 18:47:55,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:56,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:48:02,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 18:48:02,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 18:48:05,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=456366.6666666667, ans=0.125 2023-09-29 18:48:06,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:08,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:48:08,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:48:11,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:14,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=456366.6666666667, ans=0.125 2023-09-29 18:48:14,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.38 vs. limit=10.0 2023-09-29 18:48:20,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:48:20,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:48:23,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:48:31,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 18:48:33,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:48:33,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=456433.3333333333, ans=0.1 2023-09-29 18:48:36,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:39,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 18:48:41,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:48:46,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:48:46,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 18:48:48,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:48,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:52,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:52,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:48:54,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 18:48:55,735 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 18:48:55,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:59,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 18:49:00,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:49:04,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 18:49:07,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:49:07,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:10,441 INFO [train.py:1039] (1/4) Epoch 13, batch 4750, loss[loss=0.1656, simple_loss=0.2411, pruned_loss=0.045, over 24306.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2648, pruned_loss=0.05979, over 4698785.86 frames. ], batch size: 56, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:49:12,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:13,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:49:15,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 18:49:15,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:18,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 18:49:20,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:49:20,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:49:21,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.90 vs. limit=10.0 2023-09-29 18:49:22,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:27,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.07 vs. limit=22.5 2023-09-29 18:49:29,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 18:49:31,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=456700.0, ans=0.0 2023-09-29 18:49:33,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:49:34,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 18:49:36,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:39,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:40,967 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 18:49:40,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 18:49:42,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=456766.6666666667, ans=10.0 2023-09-29 18:49:48,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 18:49:51,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:53,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:49:56,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:49:56,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 18:49:56,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:49:59,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:50:02,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:50:04,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 18:50:05,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 18:50:06,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:06,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:50:07,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:07,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:50:07,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 18:50:11,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 18:50:14,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:17,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:50:17,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 18:50:17,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:50:18,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:21,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:50:22,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:23,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:50:25,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:25,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 18:50:27,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 18:50:28,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 18:50:31,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:50:31,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:33,039 INFO [train.py:1039] (1/4) Epoch 13, batch 4800, loss[loss=0.2164, simple_loss=0.2781, pruned_loss=0.07735, over 23864.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2661, pruned_loss=0.06049, over 4712866.46 frames. ], batch size: 164, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:50:33,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 18:50:40,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:40,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:43,536 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.965e+02 2.180e+02 2.463e+02 4.053e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 18:50:47,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:50:49,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:49,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:49,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=457033.3333333333, ans=0.2 2023-09-29 18:50:50,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 18:50:50,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:50:51,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:50:53,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:50:58,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:59,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:50:59,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:50:59,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:02,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:03,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:03,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=457033.3333333333, ans=0.0 2023-09-29 18:51:06,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:07,423 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.40 vs. limit=15.0 2023-09-29 18:51:08,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:08,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:51:09,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:51:12,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:12,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 18:51:14,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 18:51:14,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:15,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:51:15,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:51:15,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:15,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:51:20,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:51:20,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:22,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=457166.6666666667, ans=0.125 2023-09-29 18:51:25,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:28,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:28,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=457166.6666666667, ans=0.1 2023-09-29 18:51:30,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.74 vs. limit=15.0 2023-09-29 18:51:31,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:36,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 18:51:36,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:36,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:38,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:51:39,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:42,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:43,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=457233.3333333333, ans=0.1 2023-09-29 18:51:44,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:51:45,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:46,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:51:46,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:51:47,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:51:50,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:51,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:51,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:53,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 18:51:54,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 18:51:54,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:54,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:56,742 INFO [train.py:1039] (1/4) Epoch 13, batch 4850, loss[loss=0.1792, simple_loss=0.2413, pruned_loss=0.05859, over 23433.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.266, pruned_loss=0.06058, over 4721643.51 frames. ], batch size: 285, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:51:56,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:51:56,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:01,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:52:05,536 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.59 vs. limit=15.0 2023-09-29 18:52:07,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 18:52:08,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:13,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:14,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:52:14,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:15,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=457366.6666666667, ans=0.125 2023-09-29 18:52:20,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:20,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=457366.6666666667, ans=0.2 2023-09-29 18:52:21,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:52:23,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:52:23,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 18:52:27,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:52:29,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:52:29,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:52:30,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:52:30,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 18:52:35,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:35,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:38,272 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-09-29 18:52:39,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 18:52:39,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 18:52:40,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:52:46,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=457500.0, ans=0.2 2023-09-29 18:52:47,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=457500.0, ans=0.125 2023-09-29 18:52:49,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:52:49,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 18:52:50,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:52:50,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:52:52,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:52:55,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 18:52:55,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:56,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 18:52:56,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:56,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=457500.0, ans=0.125 2023-09-29 18:52:58,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:52:58,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 18:53:07,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:12,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=457566.6666666667, ans=0.0 2023-09-29 18:53:14,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:53:14,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:18,638 INFO [train.py:1039] (1/4) Epoch 13, batch 4900, loss[loss=0.1934, simple_loss=0.2393, pruned_loss=0.07376, over 18983.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2647, pruned_loss=0.06082, over 4709046.49 frames. ], batch size: 388, lr: 7.85e-03, grad_scale: 16.0 2023-09-29 18:53:22,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 18:53:22,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:53:27,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:29,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:29,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:53:32,646 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 2.076e+02 2.390e+02 2.815e+02 4.365e+02, threshold=4.780e+02, percent-clipped=1.0 2023-09-29 18:53:32,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 18:53:36,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.60 vs. limit=12.0 2023-09-29 18:53:38,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 18:53:39,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=457700.0, ans=0.125 2023-09-29 18:53:42,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=457700.0, ans=0.0 2023-09-29 18:53:43,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 18:53:43,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 18:53:45,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:45,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:45,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:53:46,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:46,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:53:46,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 18:53:50,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 18:53:51,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:53:52,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:53:53,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:55,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:53:56,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:58,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:58,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 18:54:00,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:54:00,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:54:00,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 18:54:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 18:54:06,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 18:54:06,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:54:08,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:08,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:54:09,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:09,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:54:09,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:54:11,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 18:54:14,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:15,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:54:17,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:54:20,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 18:54:21,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:54:23,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:54:23,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 18:54:31,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:33,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:54:35,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 18:54:35,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:35,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:54:37,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:40,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:54:40,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:54:40,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:41,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:54:41,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:54:43,210 INFO [train.py:1039] (1/4) Epoch 13, batch 4950, loss[loss=0.1889, simple_loss=0.2581, pruned_loss=0.05988, over 23326.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2635, pruned_loss=0.06043, over 4708151.60 frames. ], batch size: 105, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:54:46,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:54:46,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:46,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=457966.6666666667, ans=0.125 2023-09-29 18:54:49,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 18:54:50,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 18:54:50,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:54:52,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 18:54:52,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:52,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:52,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:54:52,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:54:55,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:57,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:54:57,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:54:58,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:55:02,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:04,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:55:04,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=458033.3333333333, ans=0.2 2023-09-29 18:55:08,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:55:13,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:13,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:55:15,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:16,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:19,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:55:19,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 18:55:21,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 18:55:23,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:23,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=458100.0, ans=0.0 2023-09-29 18:55:26,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:55:26,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:55:27,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:55:27,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:55:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:55:30,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:33,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:55:34,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:55:34,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=458166.6666666667, ans=0.125 2023-09-29 18:55:35,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:35,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:38,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 18:55:38,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:55:40,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:55:44,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:55:45,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:55:45,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:55:45,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:47,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:55:47,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:55:48,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:55:50,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:55:51,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:51,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 18:55:56,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:01,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 18:56:02,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:56:05,455 INFO [train.py:1039] (1/4) Epoch 13, batch 5000, loss[loss=0.184, simple_loss=0.2631, pruned_loss=0.05243, over 24514.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2637, pruned_loss=0.06038, over 4707335.58 frames. ], batch size: 66, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:56:05,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=458300.0, ans=0.07 2023-09-29 18:56:07,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:07,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:09,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 18:56:10,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 18:56:13,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:56:14,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 18:56:14,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:56:14,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:56:16,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 18:56:16,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:17,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:19,075 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.946e+02 2.294e+02 2.903e+02 4.132e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 18:56:19,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 18:56:19,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:19,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:20,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 18:56:20,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 18:56:22,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:56:23,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 18:56:23,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:56:23,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:24,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=458366.6666666667, ans=0.2 2023-09-29 18:56:24,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=458366.6666666667, ans=0.125 2023-09-29 18:56:25,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:56:25,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 18:56:25,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 18:56:27,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 18:56:28,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:28,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:30,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 18:56:30,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:31,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:33,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:56:36,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 18:56:36,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:56:39,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:56:43,414 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 18:56:46,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=458433.3333333333, ans=0.0 2023-09-29 18:56:47,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:49,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:49,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:56:52,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 18:56:52,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:53,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:53,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:55,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:56:56,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:56:59,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:57:01,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:07,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 18:57:07,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=458500.0, ans=0.2 2023-09-29 18:57:11,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:15,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=458566.6666666667, ans=0.1 2023-09-29 18:57:23,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:57:23,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:57:25,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:25,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:57:25,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:57:25,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:28,283 INFO [train.py:1039] (1/4) Epoch 13, batch 5050, loss[loss=0.2095, simple_loss=0.2729, pruned_loss=0.07307, over 23416.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2643, pruned_loss=0.06023, over 4725019.16 frames. ], batch size: 285, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:57:29,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:29,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 18:57:31,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:57:33,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:34,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:57:34,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 18:57:36,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:38,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:57:39,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:57:41,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:57:42,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:57:49,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=458700.0, ans=0.125 2023-09-29 18:57:54,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 18:57:54,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:57:54,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:57:54,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 18:57:57,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:57:58,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:57:58,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:58,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:57:58,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 18:57:59,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.41 vs. limit=15.0 2023-09-29 18:58:00,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 18:58:00,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=458766.6666666667, ans=0.125 2023-09-29 18:58:01,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:03,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:07,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:07,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 18:58:08,392 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.08 vs. limit=15.0 2023-09-29 18:58:10,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:12,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 18:58:12,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:58:12,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=458766.6666666667, ans=10.0 2023-09-29 18:58:13,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:58:14,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:15,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:58:17,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:58:20,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:58:20,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:20,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:58:21,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:58:21,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 18:58:23,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:58:26,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:58:32,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:33,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 18:58:33,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:58:34,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:58:34,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:35,828 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 18:58:40,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:40,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 18:58:40,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:42,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=458900.0, ans=0.1 2023-09-29 18:58:43,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:43,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:44,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 18:58:46,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 18:58:48,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:58:48,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:58:49,640 INFO [train.py:1039] (1/4) Epoch 13, batch 5100, loss[loss=0.191, simple_loss=0.2773, pruned_loss=0.05235, over 24377.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2653, pruned_loss=0.05997, over 4734038.04 frames. ], batch size: 74, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:58:49,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:58:52,916 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 18:58:53,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=458966.6666666667, ans=0.125 2023-09-29 18:58:54,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:57,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 18:58:57,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 18:58:59,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:00,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:59:02,847 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.978e+02 2.231e+02 2.583e+02 5.581e+02, threshold=4.463e+02, percent-clipped=1.0 2023-09-29 18:59:03,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:59:04,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 18:59:04,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 18:59:08,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=459033.3333333333, ans=0.5 2023-09-29 18:59:10,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:59:11,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:59:15,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:18,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=459033.3333333333, ans=0.0 2023-09-29 18:59:19,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 18:59:19,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:21,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=459100.0, ans=0.1 2023-09-29 18:59:22,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:59:22,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:59:25,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:25,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:26,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 18:59:27,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=459100.0, ans=0.0 2023-09-29 18:59:28,604 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 18:59:28,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:30,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 18:59:30,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 18:59:34,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:44,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:59:47,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 18:59:47,420 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 18:59:47,435 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 18:59:48,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 18:59:48,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:53,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 18:59:56,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 18:59:59,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:59:59,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:00:01,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=459233.3333333333, ans=0.1 2023-09-29 19:00:02,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 19:00:04,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:00:04,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 19:00:10,217 INFO [train.py:1039] (1/4) Epoch 13, batch 5150, loss[loss=0.2119, simple_loss=0.2729, pruned_loss=0.07548, over 23753.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2664, pruned_loss=0.06066, over 4728210.52 frames. ], batch size: 233, lr: 7.83e-03, grad_scale: 8.0 2023-09-29 19:00:10,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:00:10,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:00:10,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:00:10,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:00:10,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:00:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:00:14,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 19:00:14,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 19:00:15,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=459300.0, ans=0.1 2023-09-29 19:00:16,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 19:00:16,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:00:16,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 19:00:17,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:17,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:00:20,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:23,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:27,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:00:27,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 19:00:29,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:30,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:00:32,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:00:32,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:32,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:32,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:00:32,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:00:33,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 19:00:34,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:00:34,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:00:37,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:00:37,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=459366.6666666667, ans=0.125 2023-09-29 19:00:38,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 19:00:40,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:00:47,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:00:48,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=459433.3333333333, ans=0.2 2023-09-29 19:00:49,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 19:00:51,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:58,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=459433.3333333333, ans=0.125 2023-09-29 19:00:59,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:59,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:04,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:05,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:07,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 19:01:12,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:01:13,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:01:13,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:01:15,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=15.0 2023-09-29 19:01:16,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:18,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:18,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 19:01:22,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:25,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:01:28,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:01:28,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:01:30,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:01:30,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:01:30,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:01:32,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:01:33,945 INFO [train.py:1039] (1/4) Epoch 13, batch 5200, loss[loss=0.1693, simple_loss=0.2477, pruned_loss=0.04545, over 24346.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2671, pruned_loss=0.06187, over 4714447.82 frames. ], batch size: 56, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:01:35,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:01:35,860 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=459633.3333333333, ans=0.125 2023-09-29 19:01:36,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=459633.3333333333, ans=0.0 2023-09-29 19:01:37,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:01:40,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:43,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 19:01:46,061 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.009e+02 2.232e+02 2.701e+02 3.997e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 19:01:46,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:01:46,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:49,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:50,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:01:51,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:53,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 19:01:55,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:01:57,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:59,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 19:02:00,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:02:02,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:02:02,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 19:02:02,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=459700.0, ans=0.0 2023-09-29 19:02:03,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 19:02:05,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 19:02:05,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:05,604 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 19:02:05,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:02:09,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:10,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:02:10,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 19:02:10,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:02:12,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:12,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=459766.6666666667, ans=0.0 2023-09-29 19:02:16,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 19:02:16,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 19:02:18,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 19:02:22,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 19:02:23,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:02:30,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:02:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:33,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 19:02:33,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:34,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:02:34,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:34,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:02:39,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:40,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:02:44,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:44,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:02:44,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:49,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:50,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 19:02:51,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:52,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:02:52,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:54,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:02:54,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=459966.6666666667, ans=0.125 2023-09-29 19:02:55,532 INFO [train.py:1039] (1/4) Epoch 13, batch 5250, loss[loss=0.193, simple_loss=0.2825, pruned_loss=0.05177, over 24581.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2665, pruned_loss=0.06098, over 4728010.04 frames. ], batch size: 71, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:02:55,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:02:57,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:03:01,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:01,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:03:03,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:03:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:03:11,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:03:14,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:03:14,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:03:17,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=460033.3333333333, ans=0.0 2023-09-29 19:03:18,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 19:03:18,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:19,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:03:31,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff2.min_abs, batch_count=460100.0, ans=0.1 2023-09-29 19:03:32,215 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.15 vs. limit=6.0 2023-09-29 19:03:38,461 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:03:50,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.37 vs. limit=22.5 2023-09-29 19:03:52,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=460166.6666666667, ans=0.09899494936611666 2023-09-29 19:04:11,486 INFO [train.py:1039] (1/4) Epoch 13, batch 5300, loss[loss=0.1985, simple_loss=0.2811, pruned_loss=0.05793, over 23955.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2643, pruned_loss=0.06051, over 4705848.98 frames. ], batch size: 80, lr: 7.82e-03, grad_scale: 16.0 2023-09-29 19:04:20,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=460300.0, ans=0.0 2023-09-29 19:04:22,458 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.900e+02 2.152e+02 2.840e+02 4.256e+02, threshold=4.304e+02, percent-clipped=0.0 2023-09-29 19:04:22,847 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:04:26,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:04:26,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 19:04:26,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 19:04:26,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:26,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:26,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:27,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:27,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:27,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:04:27,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:27,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:04:27,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:04:27,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 19:04:28,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 19:04:28,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 19:04:28,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:04:28,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 19:04:28,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 19:04:29,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:29,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:29,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:29,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:29,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:04:30,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:30,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:30,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:30,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:30,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:30,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:04:30,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:30,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:04:31,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 19:04:31,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:32,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:32,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 19:04:32,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 19:04:32,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:04:33,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:04:33,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 19:04:33,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 19:04:33,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:34,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:04:34,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:34,443 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 19:04:34,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 19:04:34,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:04:34,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:34,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 19:04:34,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 19:04:35,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 19:04:35,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:43,514 INFO [train.py:1039] (1/4) Epoch 14, batch 0, loss[loss=0.1852, simple_loss=0.2562, pruned_loss=0.05713, over 23707.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2562, pruned_loss=0.05713, over 23707.00 frames. ], batch size: 149, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:04:43,514 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 19:04:58,068 INFO [train.py:1071] (1/4) Epoch 14, validation: loss=0.2893, simple_loss=0.2709, pruned_loss=0.1538, over 1125622.00 frames. 2023-09-29 19:04:58,069 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 19:05:00,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 19:05:01,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:05:03,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:05:09,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:09,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:05:10,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:10,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 19:05:13,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 19:05:17,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:18,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:22,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:24,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:05:24,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:26,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 19:05:28,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:34,243 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:05:37,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:05:37,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:38,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.37 vs. limit=10.0 2023-09-29 19:05:40,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 19:05:45,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:05:45,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:05:46,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:05:50,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:05:55,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:00,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 19:06:05,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 19:06:05,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:05,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:07,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:06:07,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:10,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 19:06:13,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:14,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:17,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:06:20,124 INFO [train.py:1039] (1/4) Epoch 14, batch 50, loss[loss=0.1721, simple_loss=0.2579, pruned_loss=0.0432, over 24421.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2662, pruned_loss=0.0592, over 1066265.38 frames. ], batch size: 69, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:06:20,368 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 19:06:21,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:06:24,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:26,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:26,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 19:06:28,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:06:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:06:30,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=460713.3333333333, ans=0.125 2023-09-29 19:06:31,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:33,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:35,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:37,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:38,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 19:06:38,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:45,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:06:48,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 19:06:48,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=460780.0, ans=0.0 2023-09-29 19:06:50,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 19:06:50,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:06:50,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:52,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:06:52,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:52,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:53,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:06:55,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:06:55,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:58,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=460846.6666666667, ans=0.0 2023-09-29 19:07:01,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:04,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:04,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:07:06,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 19:07:06,653 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=460846.6666666667, ans=0.125 2023-09-29 19:07:06,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=460846.6666666667, ans=0.125 2023-09-29 19:07:09,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:07:11,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:07:11,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 19:07:11,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:12,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 19:07:20,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:07:20,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:20,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:22,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:22,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:25,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 19:07:26,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 19:07:26,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:26,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:29,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:07:29,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:30,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 19:07:31,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 19:07:33,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 19:07:33,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=460980.0, ans=0.0 2023-09-29 19:07:35,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:35,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:07:35,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 19:07:35,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 19:07:37,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:38,452 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.985e+02 2.220e+02 2.670e+02 4.594e+02, threshold=4.441e+02, percent-clipped=1.0 2023-09-29 19:07:38,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:40,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:07:40,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:07:43,614 INFO [train.py:1039] (1/4) Epoch 14, batch 100, loss[loss=0.205, simple_loss=0.2786, pruned_loss=0.06567, over 23217.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2662, pruned_loss=0.05853, over 1892372.58 frames. ], batch size: 93, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:07:43,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:07:45,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:07:50,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:07:52,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 19:07:52,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:57,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:07:58,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:07:58,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:58,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:58,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:08:00,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 19:08:00,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:08:01,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:01,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:01,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:08:05,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 19:08:07,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:07,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:08,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:08:10,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:08:13,343 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 19:08:13,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=461113.3333333333, ans=0.02 2023-09-29 19:08:15,245 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 19:08:16,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:08:16,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:08:19,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:08:22,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:24,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:30,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:31,738 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 19:08:33,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:08:35,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=461246.6666666667, ans=0.125 2023-09-29 19:08:37,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:08:39,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:08:41,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:44,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:47,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:08:47,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:08:51,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:53,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:54,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:54,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:08:55,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:56,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 19:08:56,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 19:08:57,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:58,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:09:00,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:00,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:00,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 19:09:00,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:09:01,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:09:01,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:03,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:05,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:05,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:09:05,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:09:06,633 INFO [train.py:1039] (1/4) Epoch 14, batch 150, loss[loss=0.1936, simple_loss=0.2736, pruned_loss=0.05682, over 24622.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2668, pruned_loss=0.05969, over 2517195.76 frames. ], batch size: 68, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:09:08,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:10,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:09:10,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:11,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:14,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:16,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:18,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=461380.0, ans=0.0 2023-09-29 19:09:18,446 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=461380.0, ans=0.0 2023-09-29 19:09:19,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:09:19,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:24,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 19:09:24,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 19:09:24,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 19:09:27,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:09:27,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:09:29,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:09:29,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:09:29,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:30,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:31,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:31,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.73 vs. limit=15.0 2023-09-29 19:09:32,650 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 19:09:32,920 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:09:34,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:40,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:43,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:09:44,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 19:09:47,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:09:47,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:47,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:09:51,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:09:52,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:54,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:09:56,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:56,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 19:10:03,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:04,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:04,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:10:04,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:10:07,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:07,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=461580.0, ans=0.2 2023-09-29 19:10:08,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 19:10:12,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:10:13,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:10:14,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:15,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:10:15,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 19:10:15,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:10:15,718 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 19:10:16,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=461646.6666666667, ans=0.05 2023-09-29 19:10:21,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:23,775 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.853e+02 2.115e+02 2.469e+02 4.470e+02, threshold=4.229e+02, percent-clipped=1.0 2023-09-29 19:10:26,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:10:26,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:10:29,877 INFO [train.py:1039] (1/4) Epoch 14, batch 200, loss[loss=0.2654, simple_loss=0.3204, pruned_loss=0.1052, over 20009.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2677, pruned_loss=0.0604, over 2996356.32 frames. ], batch size: 388, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:10:30,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 19:10:30,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:30,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:32,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=461713.3333333333, ans=0.125 2023-09-29 19:10:34,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 19:10:36,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:10:37,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:38,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:43,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:10:43,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:43,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:45,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=461780.0, ans=0.125 2023-09-29 19:10:51,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=461780.0, ans=0.0 2023-09-29 19:10:56,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=461780.0, ans=0.125 2023-09-29 19:10:58,306 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=461780.0, ans=0.125 2023-09-29 19:11:01,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:11:02,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=461846.6666666667, ans=0.125 2023-09-29 19:11:03,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:11:04,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:11:05,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:11:05,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:11:05,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:11:06,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:08,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:11:08,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:09,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:11,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 19:11:12,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:11:12,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:16,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:11:23,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:33,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:35,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:11:42,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:45,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 19:11:46,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:46,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:11:47,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:11:49,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 19:11:49,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:11:49,554 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 19:11:49,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=462046.6666666667, ans=0.2 2023-09-29 19:11:50,899 INFO [train.py:1039] (1/4) Epoch 14, batch 250, loss[loss=0.1868, simple_loss=0.2705, pruned_loss=0.05155, over 24011.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2671, pruned_loss=0.06021, over 3376810.16 frames. ], batch size: 80, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:11:53,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:54,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:11:55,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=462046.6666666667, ans=0.125 2023-09-29 19:11:56,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:56,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:57,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:11:57,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:59,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:12:03,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:12:17,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:19,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:12:20,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:12:26,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:12:28,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:12:28,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:12:28,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:30,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:12:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:12:30,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:32,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:12:35,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 19:12:35,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:36,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:12:36,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:12:36,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:12:38,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:12:38,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:12:38,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:12:42,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:42,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:12:44,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:12:48,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:12:50,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=462246.6666666667, ans=0.04949747468305833 2023-09-29 19:12:51,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:55,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:12:56,582 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.89 vs. limit=10.0 2023-09-29 19:13:00,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:01,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:13:05,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 19:13:05,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=462313.3333333333, ans=0.1 2023-09-29 19:13:07,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:07,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:13:10,062 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.958e+02 2.110e+02 2.520e+02 4.183e+02, threshold=4.220e+02, percent-clipped=0.0 2023-09-29 19:13:10,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 19:13:10,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:13:11,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:13:11,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 19:13:13,247 INFO [train.py:1039] (1/4) Epoch 14, batch 300, loss[loss=0.1687, simple_loss=0.245, pruned_loss=0.04623, over 24421.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2654, pruned_loss=0.05994, over 3675431.83 frames. ], batch size: 58, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:13:19,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:13:19,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:13:22,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:13:24,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 19:13:24,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=462380.0, ans=0.0 2023-09-29 19:13:26,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:27,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:13:27,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 19:13:27,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:31,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:13:32,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=462446.6666666667, ans=0.125 2023-09-29 19:13:37,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:13:37,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 19:13:42,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 19:13:43,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:45,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:47,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:47,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 19:13:47,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:13:50,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:13:53,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:13:53,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:58,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:13:58,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 19:13:58,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:14:02,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:04,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 19:14:05,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:08,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:14:12,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:14:12,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 19:14:18,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:18,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:14:21,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:22,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:14:22,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 19:14:23,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:14:23,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:14:26,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 19:14:27,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:27,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:29,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:29,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:29,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:35,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:36,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=462713.3333333333, ans=0.1 2023-09-29 19:14:37,254 INFO [train.py:1039] (1/4) Epoch 14, batch 350, loss[loss=0.1938, simple_loss=0.2786, pruned_loss=0.05451, over 24602.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2635, pruned_loss=0.05945, over 3902260.25 frames. ], batch size: 71, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:14:37,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:14:39,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:46,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:50,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:50,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:53,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 19:14:55,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:55,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 19:14:58,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:58,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 19:15:00,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:02,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 19:15:03,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:15:07,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:07,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:15:09,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:09,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:09,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=462846.6666666667, ans=0.0 2023-09-29 19:15:10,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:15:13,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:15:13,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:22,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:15:22,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:15:23,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:15:24,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:30,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 19:15:30,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:35,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:35,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:35,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:15:38,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 19:15:41,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:42,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.68 vs. limit=15.0 2023-09-29 19:15:43,187 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 19:15:45,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 19:15:46,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:48,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:48,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 19:15:48,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=462980.0, ans=0.0 2023-09-29 19:15:48,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=462980.0, ans=0.2 2023-09-29 19:15:50,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:51,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:15:53,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:57,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.848e+02 2.063e+02 2.317e+02 4.440e+02, threshold=4.125e+02, percent-clipped=1.0 2023-09-29 19:15:57,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:57,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:58,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:16:00,132 INFO [train.py:1039] (1/4) Epoch 14, batch 400, loss[loss=0.1802, simple_loss=0.246, pruned_loss=0.0572, over 23630.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2623, pruned_loss=0.05905, over 4076086.98 frames. ], batch size: 256, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:16:01,071 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=12.0 2023-09-29 19:16:02,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:16:05,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:16:05,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 19:16:05,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:06,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:07,602 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-09-29 19:16:08,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:16:08,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:10,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:13,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:17,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 19:16:18,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 19:16:18,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:20,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 19:16:20,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:23,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:16:24,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:24,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 19:16:26,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:16:26,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:26,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:28,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:31,387 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 19:16:31,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 19:16:33,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.01 vs. limit=15.0 2023-09-29 19:16:34,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:36,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:37,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 19:16:39,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 19:16:42,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:16:42,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=463180.0, ans=0.0 2023-09-29 19:16:45,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:16:52,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 19:16:55,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:16:57,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 19:17:00,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:17:01,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:17:01,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 19:17:02,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=463246.6666666667, ans=0.1 2023-09-29 19:17:02,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=463246.6666666667, ans=0.125 2023-09-29 19:17:05,039 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.15 vs. limit=15.0 2023-09-29 19:17:05,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:17:07,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:17:08,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:17:11,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:13,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 19:17:14,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:17:15,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 19:17:18,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:17:18,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:17:20,952 INFO [train.py:1039] (1/4) Epoch 14, batch 450, loss[loss=0.1622, simple_loss=0.2493, pruned_loss=0.03755, over 24530.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2626, pruned_loss=0.05892, over 4220552.73 frames. ], batch size: 66, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:17:21,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 19:17:25,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:17:25,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:17:25,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:17:26,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 19:17:26,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:17:28,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:17:28,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:17:28,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 19:17:29,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:17:30,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:17:31,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:17:42,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:42,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:17:42,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=463446.6666666667, ans=0.125 2023-09-29 19:17:43,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=463446.6666666667, ans=0.125 2023-09-29 19:17:45,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 19:17:46,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 19:17:48,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:17:50,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:50,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=463446.6666666667, ans=0.125 2023-09-29 19:17:52,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:17:57,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:17:57,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:18:00,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 19:18:00,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 19:18:01,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 19:18:02,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:03,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:04,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:18:07,105 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 19:18:07,824 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 19:18:07,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:18:10,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:18:10,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:18:12,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=463580.0, ans=0.0 2023-09-29 19:18:15,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:18:15,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:18:16,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:18:18,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 19:18:21,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:24,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:18:24,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:18:25,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 19:18:30,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:18:31,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.43 vs. limit=22.5 2023-09-29 19:18:32,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 19:18:32,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 19:18:34,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:35,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.96 vs. limit=12.0 2023-09-29 19:18:39,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:18:40,722 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.869e+02 2.102e+02 2.617e+02 3.390e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-29 19:18:41,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:18:43,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:18:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 19:18:44,438 INFO [train.py:1039] (1/4) Epoch 14, batch 500, loss[loss=0.2147, simple_loss=0.2912, pruned_loss=0.06912, over 23841.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2632, pruned_loss=0.05919, over 4322535.48 frames. ], batch size: 85, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:18:46,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:48,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:18:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:48,449 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 19:18:51,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 19:18:51,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:54,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=463713.3333333333, ans=0.0 2023-09-29 19:18:55,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:18:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:19:00,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:19:03,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:19:03,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:19:05,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:05,403 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:19:12,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=463780.0, ans=0.125 2023-09-29 19:19:15,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:16,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:19:16,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:19:17,235 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=463846.6666666667, ans=0.125 2023-09-29 19:19:18,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:18,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 19:19:18,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:19:22,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:19:23,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:19:23,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:19:25,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:26,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 19:19:29,645 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 19:19:32,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:34,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:36,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:19:39,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 19:19:42,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:19:44,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:47,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:19:50,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:57,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:59,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 19:19:59,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:59,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:20:04,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 19:20:05,351 INFO [train.py:1039] (1/4) Epoch 14, batch 550, loss[loss=0.2186, simple_loss=0.2783, pruned_loss=0.07948, over 22717.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2644, pruned_loss=0.0599, over 4413718.16 frames. ], batch size: 322, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:20:05,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:20:05,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=464046.6666666667, ans=0.2 2023-09-29 19:20:07,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:13,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 19:20:14,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 19:20:14,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:14,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 19:20:15,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:20:17,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:17,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:20:19,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=464046.6666666667, ans=0.1 2023-09-29 19:20:20,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:20:22,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:23,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 19:20:23,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:20:24,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=464113.3333333333, ans=0.1 2023-09-29 19:20:28,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:28,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:30,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:20:31,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:37,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 19:20:38,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 19:20:40,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:20:45,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:20:45,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:46,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:20:50,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 19:20:52,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:52,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:20:55,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:57,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:20:57,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:20:57,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:58,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 19:21:00,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 19:21:00,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=464246.6666666667, ans=0.0 2023-09-29 19:21:00,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=464246.6666666667, ans=0.05 2023-09-29 19:21:01,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:01,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:21:01,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:01,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:21:06,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:21:07,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:21:10,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:21:10,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:12,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 19:21:13,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:21:15,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:15,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:21:16,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:18,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:21:18,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:21:25,256 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.884e+02 2.077e+02 2.403e+02 3.738e+02, threshold=4.154e+02, percent-clipped=0.0 2023-09-29 19:21:25,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 19:21:28,500 INFO [train.py:1039] (1/4) Epoch 14, batch 600, loss[loss=0.1926, simple_loss=0.2657, pruned_loss=0.05975, over 23384.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2644, pruned_loss=0.05965, over 4488474.10 frames. ], batch size: 93, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:21:30,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 19:21:31,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:21:31,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:21:31,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:39,538 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.06 vs. limit=15.0 2023-09-29 19:21:40,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:21:40,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:21:42,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 19:21:45,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:21:46,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:21:49,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:51,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 19:21:51,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:53,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=464446.6666666667, ans=0.1 2023-09-29 19:21:58,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 19:22:01,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:22:01,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:02,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:22:08,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:22:08,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:22:08,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=464513.3333333333, ans=0.0 2023-09-29 19:22:10,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:16,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:22:17,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=464580.0, ans=0.0 2023-09-29 19:22:21,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:21,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:22:21,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:21,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=464580.0, ans=0.5 2023-09-29 19:22:29,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=464580.0, ans=0.1 2023-09-29 19:22:33,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 19:22:37,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:22:37,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:22:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 19:22:42,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:22:43,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=464646.6666666667, ans=0.04949747468305833 2023-09-29 19:22:44,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 19:22:44,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:22:46,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:22:50,996 INFO [train.py:1039] (1/4) Epoch 14, batch 650, loss[loss=0.1928, simple_loss=0.2756, pruned_loss=0.055, over 24363.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2638, pruned_loss=0.05914, over 4539180.55 frames. ], batch size: 77, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:22:51,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:22:52,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.51 vs. limit=15.0 2023-09-29 19:22:53,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:22:53,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=464713.3333333333, ans=0.1 2023-09-29 19:22:56,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:22:56,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:22:58,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:22:58,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 19:23:00,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:23:06,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:23:06,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:10,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:13,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 19:23:15,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:15,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:17,635 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=15.0 2023-09-29 19:23:20,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:20,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:23:23,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:23,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:23,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:23:25,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:26,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:23:29,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:23:29,616 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 19:23:29,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:29,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:33,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:34,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:34,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:23:34,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:23:36,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 19:23:36,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:23:38,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:23:39,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:23:39,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:41,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:23:41,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 19:23:44,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 19:23:44,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:44,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:44,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:23:44,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:48,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:54,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:54,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:56,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:59,764 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.35 vs. limit=15.0 2023-09-29 19:24:00,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:00,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:24:02,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:09,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:24:09,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:09,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:09,811 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.52 vs. limit=15.0 2023-09-29 19:24:10,964 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.056e+02 2.592e+02 3.186e+02 5.109e+02, threshold=5.184e+02, percent-clipped=6.0 2023-09-29 19:24:11,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:14,006 INFO [train.py:1039] (1/4) Epoch 14, batch 700, loss[loss=0.1812, simple_loss=0.2495, pruned_loss=0.05646, over 23668.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2622, pruned_loss=0.05906, over 4573863.09 frames. ], batch size: 135, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:24:17,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 19:24:17,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 19:24:20,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 19:24:22,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:23,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:24:23,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=465046.6666666667, ans=0.0 2023-09-29 19:24:25,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 19:24:30,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:31,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=465113.3333333333, ans=0.95 2023-09-29 19:24:33,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:24:35,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:37,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=465113.3333333333, ans=0.2 2023-09-29 19:24:38,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:24:38,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:24:40,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:42,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=465113.3333333333, ans=0.2 2023-09-29 19:24:43,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:24:43,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:24:44,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=465113.3333333333, ans=0.1 2023-09-29 19:24:47,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 19:24:49,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.84 vs. limit=6.0 2023-09-29 19:24:50,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 19:24:52,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=465180.0, ans=0.125 2023-09-29 19:24:53,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:24:53,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:24:55,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:25:00,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:25:00,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 19:25:01,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=465180.0, ans=0.0 2023-09-29 19:25:06,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:06,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:25:06,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 19:25:07,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=465246.6666666667, ans=0.125 2023-09-29 19:25:10,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:25:12,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:17,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:25:24,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:25:24,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 19:25:24,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=465313.3333333333, ans=0.1 2023-09-29 19:25:27,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 19:25:27,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 19:25:27,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=465313.3333333333, ans=0.0 2023-09-29 19:25:29,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.14 vs. limit=22.5 2023-09-29 19:25:30,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:32,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:32,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:25:32,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=465313.3333333333, ans=0.125 2023-09-29 19:25:34,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:34,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 19:25:37,473 INFO [train.py:1039] (1/4) Epoch 14, batch 750, loss[loss=0.1625, simple_loss=0.2444, pruned_loss=0.04032, over 24468.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2622, pruned_loss=0.05848, over 4617256.45 frames. ], batch size: 63, lr: 7.50e-03, grad_scale: 8.0 2023-09-29 19:25:39,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 19:25:39,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 19:25:39,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 19:25:39,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=465380.0, ans=0.125 2023-09-29 19:25:41,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=465380.0, ans=0.2 2023-09-29 19:25:42,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 19:25:42,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 19:25:42,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:25:44,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 19:25:45,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:45,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:25:47,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:48,177 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.99 vs. limit=22.5 2023-09-29 19:25:50,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:50,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:25:50,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:52,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:25:52,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:25:56,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:25:57,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:59,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:59,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 19:25:59,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=465446.6666666667, ans=0.125 2023-09-29 19:26:00,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:26:01,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=465446.6666666667, ans=0.0 2023-09-29 19:26:02,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:04,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:05,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:26:05,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 19:26:05,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:09,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 19:26:09,659 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 19:26:11,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 19:26:11,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:26:13,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:26:14,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:26:15,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=465513.3333333333, ans=0.0 2023-09-29 19:26:21,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:26:21,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:21,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:26:21,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=465513.3333333333, ans=0.0 2023-09-29 19:26:24,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:26:26,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:26:26,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 19:26:28,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:26:28,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 19:26:29,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:26:31,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:26:32,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 19:26:33,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:37,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=465580.0, ans=0.0 2023-09-29 19:26:39,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:26:39,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=465580.0, ans=0.2 2023-09-29 19:26:40,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:26:42,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:26:44,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:26:44,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=465646.6666666667, ans=0.1 2023-09-29 19:26:47,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 19:26:48,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:26:49,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:51,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:51,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=465646.6666666667, ans=0.2 2023-09-29 19:26:52,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:54,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:54,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:26:59,102 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 2.067e+02 2.400e+02 2.919e+02 4.074e+02, threshold=4.801e+02, percent-clipped=0.0 2023-09-29 19:27:01,199 INFO [train.py:1039] (1/4) Epoch 14, batch 800, loss[loss=0.1954, simple_loss=0.2785, pruned_loss=0.05614, over 24428.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2637, pruned_loss=0.0593, over 4642087.50 frames. ], batch size: 69, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:27:02,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:02,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:04,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:27:04,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:05,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:06,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:07,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:10,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:12,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:27:16,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 19:27:16,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:18,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:18,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:27:18,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:21,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 19:27:21,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:21,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 19:27:24,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:25,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:28,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:27:28,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:32,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:32,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:35,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:27:37,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:27:37,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 19:27:37,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=465846.6666666667, ans=0.125 2023-09-29 19:27:37,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=465846.6666666667, ans=0.125 2023-09-29 19:27:38,825 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 19:27:40,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 19:27:40,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:27:40,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:43,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:43,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:27:46,338 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 19:27:46,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 19:27:48,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:27:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:27:54,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:27:57,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:58,023 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=12.0 2023-09-29 19:28:00,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 19:28:00,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:28:03,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 19:28:10,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:11,253 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=12.0 2023-09-29 19:28:12,745 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.62 vs. limit=15.0 2023-09-29 19:28:14,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:28:14,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 19:28:16,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:28:16,526 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:28:17,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:17,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 19:28:19,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:20,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:28:20,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:22,317 INFO [train.py:1039] (1/4) Epoch 14, batch 850, loss[loss=0.209, simple_loss=0.2703, pruned_loss=0.07388, over 23802.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2655, pruned_loss=0.05984, over 4654865.19 frames. ], batch size: 179, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:28:22,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:28:23,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:28:26,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 19:28:26,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 19:28:26,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 19:28:27,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:27,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:28:30,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:30,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:30,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:28:37,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:37,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:39,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 19:28:43,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 19:28:44,169 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.13 vs. limit=15.0 2023-09-29 19:28:46,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:47,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 19:28:52,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 19:28:52,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 19:28:55,766 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 19:28:55,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:28:55,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:28:55,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:28:59,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 19:29:03,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:29:04,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:06,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:29:06,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:29:09,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:29:10,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:29:10,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 19:29:16,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:29:16,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:16,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:29:17,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:19,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:22,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:24,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:29:27,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:29:27,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:29:28,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:29:29,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=466313.3333333333, ans=0.125 2023-09-29 19:29:38,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:29:39,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:39,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 19:29:39,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:40,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:42,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 19:29:43,738 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.955e+02 2.156e+02 2.451e+02 4.149e+02, threshold=4.312e+02, percent-clipped=0.0 2023-09-29 19:29:45,209 INFO [train.py:1039] (1/4) Epoch 14, batch 900, loss[loss=0.1819, simple_loss=0.2639, pruned_loss=0.04992, over 24058.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2657, pruned_loss=0.05999, over 4677968.73 frames. ], batch size: 80, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:29:46,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:29:48,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:29:50,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 19:29:53,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:29:53,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 19:29:55,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:29:56,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=466380.0, ans=0.0 2023-09-29 19:29:57,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:57,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:29:57,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:29:59,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:30:12,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.22 vs. limit=6.0 2023-09-29 19:30:12,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=466446.6666666667, ans=6.0 2023-09-29 19:30:12,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:12,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:30:12,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:30:17,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:22,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 19:30:25,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:30:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:30:31,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:30:33,410 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 19:30:34,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 19:30:38,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=466580.0, ans=0.1 2023-09-29 19:30:41,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:30:41,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:30:42,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:30:43,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=466580.0, ans=0.125 2023-09-29 19:30:49,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:49,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:30:53,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 19:30:53,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:56,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 19:30:56,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=466646.6666666667, ans=0.0 2023-09-29 19:30:57,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:30:57,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:59,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:30:59,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:04,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 19:31:04,481 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 19:31:06,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:31:06,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 19:31:08,191 INFO [train.py:1039] (1/4) Epoch 14, batch 950, loss[loss=0.1873, simple_loss=0.2541, pruned_loss=0.06021, over 23891.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2657, pruned_loss=0.06018, over 4693005.90 frames. ], batch size: 195, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:31:08,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:12,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 19:31:18,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:20,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:31:25,955 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 19:31:31,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:33,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:33,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:34,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:31:34,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 19:31:34,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:31:36,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:38,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 19:31:39,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:44,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:44,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 19:31:46,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:31:46,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:50,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:31:51,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=466846.6666666667, ans=0.125 2023-09-29 19:31:55,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=466846.6666666667, ans=0.0 2023-09-29 19:31:56,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:31:56,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:59,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 19:32:02,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:32:02,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:32:03,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:04,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:04,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:32:05,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=466913.3333333333, ans=0.1 2023-09-29 19:32:10,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 19:32:10,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:32:11,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:13,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:13,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 19:32:13,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:13,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:32:14,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 19:32:18,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:32:21,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=466980.0, ans=0.0 2023-09-29 19:32:23,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:28,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:30,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.850e+02 2.095e+02 2.342e+02 3.294e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 19:32:30,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 19:32:30,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 19:32:31,858 INFO [train.py:1039] (1/4) Epoch 14, batch 1000, loss[loss=0.1969, simple_loss=0.2788, pruned_loss=0.05746, over 24367.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2644, pruned_loss=0.05971, over 4698035.15 frames. ], batch size: 77, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:32:32,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:37,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 19:32:37,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:32:43,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:32:45,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 19:32:45,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 19:32:49,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:32:49,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:51,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:51,733 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=467113.3333333333, ans=0.125 2023-09-29 19:32:54,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 19:32:55,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=467113.3333333333, ans=0.1 2023-09-29 19:32:59,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 19:32:59,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=467113.3333333333, ans=0.125 2023-09-29 19:33:00,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 19:33:02,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:02,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=467113.3333333333, ans=0.0 2023-09-29 19:33:05,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 19:33:05,862 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.19 vs. limit=15.0 2023-09-29 19:33:06,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 19:33:06,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 19:33:08,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:10,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:15,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=467180.0, ans=0.125 2023-09-29 19:33:17,778 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.95 vs. limit=15.0 2023-09-29 19:33:18,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:18,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:33:18,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:19,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:19,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 19:33:19,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:21,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:33:21,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:22,930 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 19:33:24,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 19:33:26,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 19:33:29,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 19:33:29,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=467246.6666666667, ans=0.125 2023-09-29 19:33:33,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:33:38,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:38,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:33:38,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:40,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:33:42,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=467313.3333333333, ans=0.125 2023-09-29 19:33:43,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 19:33:43,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:33:44,439 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.83 vs. limit=15.0 2023-09-29 19:33:44,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 19:33:45,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 19:33:46,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:33:46,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:48,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:33:51,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:33:51,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:54,444 INFO [train.py:1039] (1/4) Epoch 14, batch 1050, loss[loss=0.1594, simple_loss=0.2038, pruned_loss=0.05753, over 19095.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2629, pruned_loss=0.05947, over 4687393.28 frames. ], batch size: 388, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:33:56,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:33:56,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:33:59,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:34:01,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:04,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:06,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:34:08,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:34:10,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:34:10,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:34:11,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:34:12,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:34:12,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 19:34:13,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:15,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 19:34:16,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:34:17,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 19:34:17,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:34:18,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=467446.6666666667, ans=0.0 2023-09-29 19:34:23,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=467446.6666666667, ans=0.025 2023-09-29 19:34:25,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:25,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:34:27,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:29,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 19:34:29,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 19:34:29,180 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:34:30,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:32,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 19:34:34,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 19:34:36,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:34:38,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:34:40,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=467513.3333333333, ans=0.125 2023-09-29 19:34:42,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:34:44,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:34:45,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:34:47,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:34:51,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 19:34:53,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 19:34:53,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 19:34:54,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:34:55,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:34:56,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 19:35:01,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:35:02,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:35:02,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:02,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:04,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 19:35:09,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:09,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 19:35:09,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 19:35:11,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:35:15,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:35:17,480 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.954e+02 2.210e+02 2.556e+02 3.990e+02, threshold=4.421e+02, percent-clipped=0.0 2023-09-29 19:35:19,001 INFO [train.py:1039] (1/4) Epoch 14, batch 1100, loss[loss=0.1862, simple_loss=0.2601, pruned_loss=0.05613, over 23684.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2621, pruned_loss=0.0593, over 4680881.53 frames. ], batch size: 149, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:35:23,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:35:26,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:35:28,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:35:28,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:28,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 19:35:30,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:35:31,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:35:33,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:35:37,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:35:38,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 19:35:38,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:35:40,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:40,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:41,635 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:35:42,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=467780.0, ans=15.0 2023-09-29 19:35:44,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:35:44,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=467780.0, ans=0.125 2023-09-29 19:35:46,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:35:52,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:35:54,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 19:35:55,725 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 19:35:57,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:36:01,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:36:03,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 19:36:03,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:36:03,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:36:03,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:36:04,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:04,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 19:36:12,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:36:12,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 19:36:14,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:36:19,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:36:22,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 19:36:22,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:36:23,800 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.01 vs. limit=10.0 2023-09-29 19:36:24,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:28,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:28,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:29,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 19:36:29,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:36:31,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:31,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 19:36:31,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:36:31,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 19:36:34,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:36:34,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:36:35,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:36:40,212 INFO [train.py:1039] (1/4) Epoch 14, batch 1150, loss[loss=0.1964, simple_loss=0.2783, pruned_loss=0.05728, over 23986.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2628, pruned_loss=0.0592, over 4692816.97 frames. ], batch size: 80, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:36:41,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:42,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=468046.6666666667, ans=0.1 2023-09-29 19:36:45,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:36:46,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:48,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:36:48,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 19:36:48,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:36:51,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 19:36:52,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:52,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:36:58,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 19:36:58,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=468113.3333333333, ans=0.125 2023-09-29 19:37:02,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:06,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:37:07,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:08,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 19:37:08,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:37:08,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:37:11,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 19:37:11,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:13,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:37:22,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=468180.0, ans=0.07 2023-09-29 19:37:23,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 19:37:32,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:32,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=468246.6666666667, ans=0.0 2023-09-29 19:37:33,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:36,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.53 vs. limit=15.0 2023-09-29 19:37:37,534 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 19:37:39,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:46,593 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 19:37:49,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:37:51,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:37:51,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:37:51,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:37:56,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:01,581 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.871e+02 2.319e+02 2.937e+02 5.340e+02, threshold=4.639e+02, percent-clipped=1.0 2023-09-29 19:38:01,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:38:01,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:38:03,177 INFO [train.py:1039] (1/4) Epoch 14, batch 1200, loss[loss=0.2036, simple_loss=0.2829, pruned_loss=0.06215, over 24406.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2643, pruned_loss=0.05979, over 4697112.10 frames. ], batch size: 77, lr: 7.48e-03, grad_scale: 32.0 2023-09-29 19:38:03,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:03,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:03,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=468380.0, ans=0.0 2023-09-29 19:38:05,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:38:08,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:38:10,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:38:11,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:13,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:15,018 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 19:38:16,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 19:38:16,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=468380.0, ans=0.125 2023-09-29 19:38:21,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:38:22,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:38:26,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:27,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:38:27,955 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 19:38:28,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:35,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=468513.3333333333, ans=0.0 2023-09-29 19:38:35,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=468513.3333333333, ans=0.125 2023-09-29 19:38:37,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:38:37,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:38:37,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 19:38:39,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:38:44,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 19:38:47,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 19:38:47,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:49,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:49,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:38:50,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:38:53,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:53,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:38:53,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:38:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 19:38:55,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=468580.0, ans=0.125 2023-09-29 19:38:56,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:38:56,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:38:56,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:39:00,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:00,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:02,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:39:03,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:39:06,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 19:39:10,654 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 19:39:14,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:17,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:39:18,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:39:21,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:23,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 19:39:25,443 INFO [train.py:1039] (1/4) Epoch 14, batch 1250, loss[loss=0.1864, simple_loss=0.2726, pruned_loss=0.05009, over 24360.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2646, pruned_loss=0.06016, over 4706468.46 frames. ], batch size: 74, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:39:28,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:39:29,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:30,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 19:39:33,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:39:34,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:39:34,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=468713.3333333333, ans=0.0 2023-09-29 19:39:39,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:39:40,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:41,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:39:41,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:44,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:39:48,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 19:39:49,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:39:49,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:50,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:50,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:39:54,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:56,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:40:01,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 19:40:02,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:40:05,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:06,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 19:40:06,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:40:06,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=468846.6666666667, ans=0.1 2023-09-29 19:40:07,608 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 19:40:07,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:07,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:12,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:15,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:17,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:40:17,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=468913.3333333333, ans=0.125 2023-09-29 19:40:18,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 19:40:19,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 19:40:19,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 19:40:22,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:24,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 19:40:24,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:29,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:40:29,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:40:32,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 19:40:32,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:40:34,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:40:34,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:40:34,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:37,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 19:40:38,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:40,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:40:40,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:40:45,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:40:46,414 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.815e+02 2.067e+02 2.359e+02 2.982e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-29 19:40:46,457 INFO [train.py:1039] (1/4) Epoch 14, batch 1300, loss[loss=0.1782, simple_loss=0.2436, pruned_loss=0.05637, over 23468.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2645, pruned_loss=0.06015, over 4716487.77 frames. ], batch size: 285, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:40:48,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:48,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 19:40:54,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:56,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:40:58,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:40:58,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:41:00,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:41:01,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 19:41:06,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:41:08,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:41:09,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 19:41:11,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=469113.3333333333, ans=0.125 2023-09-29 19:41:14,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:41:18,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:19,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:22,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:41:24,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:24,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:41:25,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:41:27,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 19:41:30,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:41:31,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:41:34,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 19:41:35,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:41:37,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:41:40,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:41:40,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 19:41:40,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:40,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 19:41:42,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=469246.6666666667, ans=0.0 2023-09-29 19:41:43,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:47,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:47,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:41:50,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 19:41:50,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 19:41:52,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 19:41:57,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:42:00,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 19:42:02,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:09,250 INFO [train.py:1039] (1/4) Epoch 14, batch 1350, loss[loss=0.1759, simple_loss=0.2378, pruned_loss=0.05698, over 23543.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2631, pruned_loss=0.05944, over 4715819.90 frames. ], batch size: 256, lr: 7.47e-03, grad_scale: 8.0 2023-09-29 19:42:11,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 19:42:14,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:16,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:17,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:18,577 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.51 vs. limit=15.0 2023-09-29 19:42:20,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:20,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:21,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-09-29 19:42:22,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:42:24,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:24,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=469446.6666666667, ans=0.125 2023-09-29 19:42:27,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:29,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 19:42:30,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:42:30,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:42:34,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.92 vs. limit=15.0 2023-09-29 19:42:35,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 19:42:35,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:42:36,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:42:36,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 19:42:39,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 19:42:41,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 19:42:42,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=469513.3333333333, ans=0.1 2023-09-29 19:42:43,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:43,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 19:42:51,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=469513.3333333333, ans=0.1 2023-09-29 19:42:56,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:59,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=469580.0, ans=0.1 2023-09-29 19:43:06,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:43:06,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:06,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 19:43:07,581 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=12.0 2023-09-29 19:43:11,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:11,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 19:43:12,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:43:12,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:43:14,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:43:14,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=469646.6666666667, ans=0.2 2023-09-29 19:43:17,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 19:43:19,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:43:24,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 19:43:25,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 19:43:31,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 19:43:32,804 INFO [train.py:1039] (1/4) Epoch 14, batch 1400, loss[loss=0.1922, simple_loss=0.258, pruned_loss=0.06321, over 23658.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2624, pruned_loss=0.05886, over 4715851.43 frames. ], batch size: 164, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:43:32,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:34,252 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.861e+02 2.134e+02 2.363e+02 3.336e+02, threshold=4.269e+02, percent-clipped=0.0 2023-09-29 19:43:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:43:38,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:43:41,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=469713.3333333333, ans=0.2 2023-09-29 19:43:41,678 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=469713.3333333333, ans=0.125 2023-09-29 19:43:43,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 19:43:44,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 19:43:54,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:43:56,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:43:57,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:43:58,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:44:03,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:44:05,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:44:07,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=469846.6666666667, ans=0.125 2023-09-29 19:44:16,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:16,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:21,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 19:44:21,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:44:21,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:44:22,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:44:24,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:44:24,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:44:24,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=469913.3333333333, ans=0.05 2023-09-29 19:44:25,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:44:25,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:44:27,142 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.75 vs. limit=22.5 2023-09-29 19:44:27,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 19:44:27,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:44:29,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=469913.3333333333, ans=0.1 2023-09-29 19:44:31,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:34,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:44:38,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.56 vs. limit=22.5 2023-09-29 19:44:41,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 19:44:42,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:44:42,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=469980.0, ans=0.1 2023-09-29 19:44:44,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:44:46,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:44:48,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:44:49,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:44:52,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:44:53,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:44:53,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:54,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:44:56,252 INFO [train.py:1039] (1/4) Epoch 14, batch 1450, loss[loss=0.2115, simple_loss=0.2679, pruned_loss=0.0775, over 22787.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2611, pruned_loss=0.05874, over 4702231.99 frames. ], batch size: 322, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:44:56,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=470046.6666666667, ans=0.0 2023-09-29 19:44:59,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:00,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:45:01,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:45:01,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 19:45:02,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:45:05,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 19:45:05,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:07,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:07,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 19:45:07,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=470046.6666666667, ans=0.1 2023-09-29 19:45:10,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:10,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:45:12,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 19:45:12,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:13,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:45:15,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:16,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:22,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:45:22,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:45:24,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:24,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:26,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:27,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:45:27,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:28,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:29,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=470180.0, ans=0.0 2023-09-29 19:45:31,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 19:45:34,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:34,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=470180.0, ans=0.125 2023-09-29 19:45:37,569 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 19:45:40,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:45:41,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:45:42,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.20 vs. limit=15.0 2023-09-29 19:45:43,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:44,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 19:45:50,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:51,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 19:45:51,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 19:45:52,627 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.54 vs. limit=15.0 2023-09-29 19:45:54,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:57,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:45:57,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=470246.6666666667, ans=0.125 2023-09-29 19:45:59,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:00,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 19:46:04,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 19:46:05,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 19:46:07,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:09,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:46:10,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=470313.3333333333, ans=0.125 2023-09-29 19:46:12,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=470313.3333333333, ans=0.125 2023-09-29 19:46:18,284 INFO [train.py:1039] (1/4) Epoch 14, batch 1500, loss[loss=0.1601, simple_loss=0.2382, pruned_loss=0.04101, over 24594.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2619, pruned_loss=0.05899, over 4705577.53 frames. ], batch size: 60, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:46:19,668 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.877e+02 2.089e+02 2.456e+02 3.474e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 19:46:21,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 19:46:21,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:46:21,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:46:22,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:23,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:25,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:46:26,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 19:46:28,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:46:28,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:46:28,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:30,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:46:30,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:46:32,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:32,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=470380.0, ans=0.125 2023-09-29 19:46:38,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:39,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 19:46:39,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:46:39,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:46:41,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:44,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 19:46:48,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 19:46:50,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:51,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 19:46:53,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:46:56,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:46:57,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:57,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:59,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 19:47:00,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:47:00,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:01,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 19:47:01,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:08,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:47:08,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 19:47:10,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=470580.0, ans=0.025 2023-09-29 19:47:15,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:47:16,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:47:21,553 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 19:47:22,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:22,951 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 19:47:23,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:25,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:47:26,044 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 19:47:26,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=470646.6666666667, ans=0.2 2023-09-29 19:47:27,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:47:29,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 19:47:31,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:32,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:32,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:34,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:34,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:36,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:47:37,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 19:47:39,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 19:47:39,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:47:40,985 INFO [train.py:1039] (1/4) Epoch 14, batch 1550, loss[loss=0.1851, simple_loss=0.2523, pruned_loss=0.05898, over 23685.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2628, pruned_loss=0.05881, over 4721384.01 frames. ], batch size: 149, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:47:41,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 19:47:42,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 19:47:44,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:47:46,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:46,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:47:46,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:47:48,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:48,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=470713.3333333333, ans=0.0 2023-09-29 19:47:50,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:53,235 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 19:47:53,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:53,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:47:54,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:47:58,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:47:58,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 19:48:00,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:48:00,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 19:48:01,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=470780.0, ans=0.95 2023-09-29 19:48:02,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 19:48:02,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 19:48:02,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:04,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:06,344 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-09-29 19:48:08,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:48:10,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 19:48:10,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 19:48:17,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=470846.6666666667, ans=0.125 2023-09-29 19:48:20,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:22,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=470846.6666666667, ans=0.125 2023-09-29 19:48:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:48:26,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:48:26,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:48:27,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 19:48:30,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:48:33,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:35,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:48:38,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:48:38,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:39,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 19:48:39,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:48:41,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:48:42,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:44,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 19:48:44,357 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 19:48:47,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:48:51,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 19:48:55,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=470980.0, ans=0.1 2023-09-29 19:48:57,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:48:57,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:59,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 19:49:00,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=470980.0, ans=0.2 2023-09-29 19:49:01,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:49:02,611 INFO [train.py:1039] (1/4) Epoch 14, batch 1600, loss[loss=0.18, simple_loss=0.2669, pruned_loss=0.04653, over 24612.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.2641, pruned_loss=0.05928, over 4724250.84 frames. ], batch size: 68, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:49:02,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:02,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:49:02,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:49:02,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:49:04,176 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.865e+02 2.125e+02 2.416e+02 3.474e+02, threshold=4.250e+02, percent-clipped=0.0 2023-09-29 19:49:05,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:06,698 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.92 vs. limit=12.0 2023-09-29 19:49:07,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 19:49:08,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 19:49:10,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 19:49:12,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:13,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 19:49:13,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:49:16,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:49:24,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:49:27,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 19:49:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:49:32,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 19:49:32,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:32,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 19:49:36,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=471180.0, ans=0.2 2023-09-29 19:49:37,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 19:49:44,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:45,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 19:49:45,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:46,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:46,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:49:48,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 19:49:55,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 19:49:56,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:58,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:58,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:00,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:50:01,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:50:03,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:50:06,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:50:10,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=471313.3333333333, ans=0.0 2023-09-29 19:50:11,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:12,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:50:15,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 19:50:16,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:50:17,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 19:50:19,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=471313.3333333333, ans=0.2 2023-09-29 19:50:21,412 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.47 vs. limit=15.0 2023-09-29 19:50:22,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=471380.0, ans=0.035 2023-09-29 19:50:23,419 INFO [train.py:1039] (1/4) Epoch 14, batch 1650, loss[loss=0.1819, simple_loss=0.2411, pruned_loss=0.06137, over 23452.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.264, pruned_loss=0.05936, over 4723109.93 frames. ], batch size: 285, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:50:23,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:25,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:50:26,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:50:26,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 19:50:26,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 19:50:26,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 19:50:28,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 19:50:29,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:32,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:32,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:50:32,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:50:33,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:37,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 19:50:40,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:50:40,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:40,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:50:40,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:50:42,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 19:50:42,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 19:50:47,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:50:49,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:50:56,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 19:50:58,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:01,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 19:51:01,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=471513.3333333333, ans=0.125 2023-09-29 19:51:03,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:08,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:51:10,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:51:10,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:10,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=471513.3333333333, ans=0.1 2023-09-29 19:51:11,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:51:13,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:14,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:16,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:16,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:16,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:18,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:19,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:51:20,635 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.62 vs. limit=6.0 2023-09-29 19:51:24,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:25,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 19:51:25,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:27,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 19:51:28,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 19:51:28,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 19:51:28,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:28,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:51:30,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:30,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:30,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 19:51:32,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:34,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:51:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:41,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 19:51:44,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:44,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:51:44,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 19:51:46,306 INFO [train.py:1039] (1/4) Epoch 14, batch 1700, loss[loss=0.1879, simple_loss=0.2691, pruned_loss=0.05331, over 24402.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2628, pruned_loss=0.05927, over 4722470.01 frames. ], batch size: 77, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:51:46,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:51:46,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:51:46,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:49,270 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.869e+02 2.042e+02 2.278e+02 4.402e+02, threshold=4.084e+02, percent-clipped=1.0 2023-09-29 19:51:49,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:51:49,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:51:49,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 19:51:54,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:51:54,675 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:52:04,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:52:06,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:52:11,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:52:13,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:13,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:52:14,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:16,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 19:52:18,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:52:18,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:18,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=471846.6666666667, ans=0.0 2023-09-29 19:52:20,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:52:21,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:52:23,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 19:52:24,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 19:52:27,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=471846.6666666667, ans=0.125 2023-09-29 19:52:28,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:29,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 19:52:31,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:52:39,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:40,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:40,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:43,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:52:43,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 19:52:43,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:47,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:47,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 19:52:49,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:52:49,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:49,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:49,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:52:50,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:50,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:52:53,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:53,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:52:54,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:57,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:00,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 19:53:01,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:03,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:04,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 19:53:07,977 INFO [train.py:1039] (1/4) Epoch 14, batch 1750, loss[loss=0.1899, simple_loss=0.276, pruned_loss=0.05188, over 24582.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2614, pruned_loss=0.05896, over 4714301.27 frames. ], batch size: 71, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:53:11,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:14,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:15,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:53:15,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 19:53:15,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:53:19,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:53:19,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:23,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=472113.3333333333, ans=0.125 2023-09-29 19:53:24,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 19:53:26,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:28,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 19:53:28,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:53:31,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:53:34,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:53:36,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 19:53:38,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:53:38,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 19:53:46,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:53:50,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:53:50,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:50,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=472180.0, ans=0.0 2023-09-29 19:53:53,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:54,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:56,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:56,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:59,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:53:59,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:54:01,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 19:54:05,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:54:06,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 19:54:06,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:08,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:09,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:54:14,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:54:15,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:54:15,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:18,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:23,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:25,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:27,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:54:27,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 19:54:27,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:27,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=472313.3333333333, ans=0.95 2023-09-29 19:54:28,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:54:28,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:29,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:54:29,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:54:31,028 INFO [train.py:1039] (1/4) Epoch 14, batch 1800, loss[loss=0.1867, simple_loss=0.2524, pruned_loss=0.06057, over 23688.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2606, pruned_loss=0.05878, over 4693969.07 frames. ], batch size: 232, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:54:31,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:54:34,554 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.007e+02 2.286e+02 2.732e+02 4.452e+02, threshold=4.572e+02, percent-clipped=3.0 2023-09-29 19:54:34,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:54:34,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:36,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:54:39,224 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-09-29 19:54:41,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:43,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 19:54:44,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:54:47,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:54:49,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=472446.6666666667, ans=0.1 2023-09-29 19:54:50,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:50,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=472446.6666666667, ans=0.0 2023-09-29 19:54:51,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:53,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:54:54,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:55,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 19:54:55,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:54:57,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.53 vs. limit=22.5 2023-09-29 19:54:59,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:06,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 19:55:06,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 19:55:06,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=472513.3333333333, ans=0.0 2023-09-29 19:55:08,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 19:55:08,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:10,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:55:10,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:10,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:55:15,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 19:55:17,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:55:20,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:21,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 19:55:21,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 19:55:21,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:55:23,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:55:24,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:55:29,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 19:55:36,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:55:36,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 19:55:37,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.90 vs. limit=15.0 2023-09-29 19:55:37,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:55:37,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:38,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:55:39,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 19:55:45,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:55:45,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:55:48,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 19:55:48,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:51,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:51,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:55:51,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:52,904 INFO [train.py:1039] (1/4) Epoch 14, batch 1850, loss[loss=0.1762, simple_loss=0.2526, pruned_loss=0.04992, over 23647.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2616, pruned_loss=0.05892, over 4696502.69 frames. ], batch size: 149, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:55:53,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:53,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:55:54,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:56,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:57,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:55:59,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:05,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:56:05,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 19:56:10,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 19:56:12,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 19:56:16,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:18,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 19:56:18,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:56:29,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:56:30,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 19:56:34,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:56:34,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:56:39,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 19:56:40,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:41,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:56:43,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:56:46,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:48,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:56:48,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=472913.3333333333, ans=0.125 2023-09-29 19:56:51,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:56:53,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:53,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:56:53,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:55,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:56:57,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:57:00,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 19:57:01,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:57:02,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=472980.0, ans=22.5 2023-09-29 19:57:04,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:57:06,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:57:06,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 19:57:06,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 19:57:08,026 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 19:57:09,539 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 19:57:11,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:57:11,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:57:11,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:12,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:12,705 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 19:57:12,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:57:14,121 INFO [train.py:1039] (1/4) Epoch 14, batch 1900, loss[loss=0.1689, simple_loss=0.2467, pruned_loss=0.04557, over 24473.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2617, pruned_loss=0.05856, over 4708871.00 frames. ], batch size: 58, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:57:14,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:14,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:57:15,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.69 vs. limit=15.0 2023-09-29 19:57:15,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:57:17,954 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.953e+02 2.438e+02 3.060e+02 4.986e+02, threshold=4.875e+02, percent-clipped=3.0 2023-09-29 19:57:18,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:57:18,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 19:57:18,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=473046.6666666667, ans=0.125 2023-09-29 19:57:19,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:19,741 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 19:57:19,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:57:21,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:27,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:29,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:57:32,991 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 19:57:33,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 19:57:34,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:36,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:57:36,092 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 19:57:36,149 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 19:57:37,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=473113.3333333333, ans=0.0 2023-09-29 19:57:41,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 19:57:43,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:57:46,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 19:57:48,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 19:57:58,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 19:58:00,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=473180.0, ans=0.125 2023-09-29 19:58:01,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 19:58:01,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:02,119 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 19:58:02,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 19:58:02,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 19:58:03,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 19:58:03,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:08,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 19:58:11,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:58:13,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:13,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 19:58:16,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:58:19,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 19:58:20,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:21,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=473313.3333333333, ans=0.1 2023-09-29 19:58:27,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:58:27,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:58:27,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:58:27,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:58:29,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:58:29,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 19:58:30,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:58:34,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:34,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:58:35,793 INFO [train.py:1039] (1/4) Epoch 14, batch 1950, loss[loss=0.23, simple_loss=0.2918, pruned_loss=0.08411, over 22863.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2632, pruned_loss=0.05936, over 4704448.74 frames. ], batch size: 322, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:58:35,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:58:35,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:36,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:37,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:42,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:42,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=473380.0, ans=0.125 2023-09-29 19:58:44,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:58:45,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:45,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:58:47,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 19:58:48,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:58:48,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:50,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:53,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:58:53,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:58:54,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:58:59,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:59,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:59:01,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:59:01,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:06,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:09,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:59:09,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:09,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:59:09,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 19:59:11,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:59:11,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:59:11,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:14,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:17,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:59:23,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:59:26,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:59:26,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:59:27,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 19:59:27,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:59:29,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=473580.0, ans=0.04949747468305833 2023-09-29 19:59:31,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=473580.0, ans=0.125 2023-09-29 19:59:32,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:59:33,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:59:33,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:59:37,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=473580.0, ans=0.0 2023-09-29 19:59:43,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:45,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:47,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:49,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:52,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:59:54,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:54,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 19:59:54,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:59:55,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:55,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 19:59:57,177 INFO [train.py:1039] (1/4) Epoch 14, batch 2000, loss[loss=0.1999, simple_loss=0.279, pruned_loss=0.06042, over 24032.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2643, pruned_loss=0.05957, over 4713189.60 frames. ], batch size: 80, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 19:59:58,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:00,435 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.917e+02 2.206e+02 2.573e+02 3.762e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 20:00:00,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:00:02,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:00:02,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:03,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:00:06,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:11,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 20:00:11,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:00:16,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:00:18,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 20:00:19,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:00:19,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:21,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:00:24,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 20:00:26,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:28,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 20:00:29,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:00:31,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 20:00:31,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:34,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:00:34,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:00:34,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:35,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:37,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:38,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 20:00:40,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 20:00:40,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:40,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:00:45,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:46,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:00:46,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:48,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:48,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=473913.3333333333, ans=0.1 2023-09-29 20:00:49,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:51,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:51,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:51,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:53,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:56,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:57,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 20:01:04,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:01:05,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:01:13,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:15,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:15,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:15,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:01:15,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:01:15,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=473980.0, ans=0.2 2023-09-29 20:01:18,048 INFO [train.py:1039] (1/4) Epoch 14, batch 2050, loss[loss=0.1929, simple_loss=0.2625, pruned_loss=0.0617, over 23461.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2641, pruned_loss=0.06035, over 4702485.67 frames. ], batch size: 119, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:01:18,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:19,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:25,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:25,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:30,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:01:33,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:01:35,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:35,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:01:36,059 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.45 vs. limit=22.5 2023-09-29 20:01:36,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 20:01:36,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:01:37,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:01:38,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:01:48,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:48,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:49,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 20:01:50,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=474180.0, ans=0.1 2023-09-29 20:01:53,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:53,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=474180.0, ans=0.125 2023-09-29 20:01:54,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 20:01:55,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:58,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:01,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:03,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:02:03,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:05,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:02:06,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:02:06,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:02:08,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:10,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:02:13,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:02:13,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:02:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:18,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=474246.6666666667, ans=0.125 2023-09-29 20:02:22,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:02:23,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 20:02:29,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:30,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:02:32,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:02:33,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 20:02:38,877 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 20:02:38,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:38,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:40,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:02:40,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:41,976 INFO [train.py:1039] (1/4) Epoch 14, batch 2100, loss[loss=0.1974, simple_loss=0.2591, pruned_loss=0.06783, over 23710.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2618, pruned_loss=0.06003, over 4675058.44 frames. ], batch size: 232, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:02:42,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 20:02:42,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 20:02:43,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:45,127 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.952e+02 2.197e+02 2.435e+02 3.188e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 20:02:46,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:02:48,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:02:50,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:51,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:02:51,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 20:02:52,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:02:54,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 20:02:54,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 20:02:56,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:02:56,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:02:56,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 20:02:58,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:03:05,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 20:03:05,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:03:08,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:03:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:03:13,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:03:15,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 20:03:15,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:15,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 20:03:18,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 20:03:18,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:18,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 20:03:18,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 20:03:20,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 20:03:21,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:03:23,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:03:26,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:27,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:29,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:30,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.83 vs. limit=15.0 2023-09-29 20:03:31,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:31,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 20:03:31,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:31,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:31,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=474580.0, ans=0.0 2023-09-29 20:03:32,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:32,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 20:03:36,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 20:03:36,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 20:03:41,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:03:44,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:03:44,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 20:03:49,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:52,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.45 vs. limit=15.0 2023-09-29 20:03:53,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:03:53,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:03:53,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:03:53,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 20:03:53,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:03:55,566 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.98 vs. limit=10.0 2023-09-29 20:03:57,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:57,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:03:57,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:03:57,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:59,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 20:04:01,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 20:04:01,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:02,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:02,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:04:04,268 INFO [train.py:1039] (1/4) Epoch 14, batch 2150, loss[loss=0.1776, simple_loss=0.247, pruned_loss=0.05415, over 19795.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2609, pruned_loss=0.05936, over 4694131.63 frames. ], batch size: 43, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:04:04,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:04:04,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:04:10,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=474713.3333333333, ans=0.125 2023-09-29 20:04:11,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=474713.3333333333, ans=0.1 2023-09-29 20:04:13,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 20:04:15,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:15,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:17,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:04:18,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:18,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:04:22,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:23,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:04:23,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:04:26,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:28,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 20:04:30,041 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:04:32,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:34,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:04:37,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:37,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:04:38,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:38,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:04:38,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:04:40,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 20:04:42,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:04:42,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:42,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:44,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:04:46,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:04:48,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:50,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:04:51,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:51,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 20:04:51,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:04:53,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:53,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:54,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:56,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:04:58,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:59,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:59,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 20:05:02,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 20:05:02,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:05:02,894 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 20:05:04,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:04,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:05:05,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 20:05:05,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:05:05,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 20:05:05,977 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 20:05:05,978 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 20:05:06,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 20:05:07,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:09,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:05:09,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:05:10,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:12,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:05:13,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:15,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:24,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:05:25,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 20:05:27,307 INFO [train.py:1039] (1/4) Epoch 14, batch 2200, loss[loss=0.203, simple_loss=0.2801, pruned_loss=0.06295, over 23968.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2614, pruned_loss=0.059, over 4691992.54 frames. ], batch size: 86, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:05:29,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:05:29,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=475046.6666666667, ans=0.125 2023-09-29 20:05:30,432 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.895e+02 2.112e+02 2.594e+02 4.631e+02, threshold=4.225e+02, percent-clipped=1.0 2023-09-29 20:05:35,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:35,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:05:37,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:05:37,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:05:40,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:42,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:05:42,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 20:05:46,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 20:05:48,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:05:48,643 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:05:53,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 20:05:57,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:58,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:00,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:06:03,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:06:03,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 20:06:05,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:06:07,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:07,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:06:12,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:06:13,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:15,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:06:16,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:18,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 20:06:19,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:23,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 20:06:26,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:26,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:06:26,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:29,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:29,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:29,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:29,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:30,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:06:32,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:06:32,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:06:35,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 20:06:35,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:06:38,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:06:40,814 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 20:06:41,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:06:42,445 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 20:06:44,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:06:44,098 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 20:06:45,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:46,482 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-09-29 20:06:47,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:06:47,978 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.74 vs. limit=12.0 2023-09-29 20:06:50,011 INFO [train.py:1039] (1/4) Epoch 14, batch 2250, loss[loss=0.199, simple_loss=0.2754, pruned_loss=0.0613, over 24493.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2623, pruned_loss=0.05891, over 4700408.74 frames. ], batch size: 66, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:06:50,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:51,640 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 20:06:53,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:06:56,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:04,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:07:05,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:07:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:09,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:10,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:12,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 20:07:12,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:12,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:07:15,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 20:07:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:07:17,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:19,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:23,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:24,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:07:24,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:07:27,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 20:07:28,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:30,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:07:30,853 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.33 vs. limit=6.0 2023-09-29 20:07:35,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:07:37,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:42,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:43,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:07:48,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:07:49,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:07:53,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:07:55,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:07:55,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:07:58,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=475646.6666666667, ans=0.125 2023-09-29 20:08:02,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:08:04,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:08:04,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 20:08:04,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:05,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:08:08,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 20:08:12,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:08:12,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:13,504 INFO [train.py:1039] (1/4) Epoch 14, batch 2300, loss[loss=0.1917, simple_loss=0.2564, pruned_loss=0.06349, over 23604.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2623, pruned_loss=0.05896, over 4696851.80 frames. ], batch size: 149, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:08:16,565 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.985e+02 2.281e+02 2.632e+02 4.053e+02, threshold=4.563e+02, percent-clipped=0.0 2023-09-29 20:08:18,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:19,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:08:21,435 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 20:08:23,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:24,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=475713.3333333333, ans=0.0 2023-09-29 20:08:30,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:08:30,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:08:30,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:08:30,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:30,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 20:08:33,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:08:34,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:08:36,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:08:38,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=475780.0, ans=0.0 2023-09-29 20:08:41,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:08:42,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=475780.0, ans=0.125 2023-09-29 20:08:45,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:08:49,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:08:52,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:08:54,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:57,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:08:59,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:04,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:09:05,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:09:05,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:09:05,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 20:09:10,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:09:10,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:10,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:10,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:09:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:11,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:09:11,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:09:12,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 20:09:12,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=475913.3333333333, ans=0.125 2023-09-29 20:09:13,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:09:13,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:13,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 20:09:21,713 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.03 vs. limit=15.0 2023-09-29 20:09:22,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:09:22,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=475980.0, ans=0.2 2023-09-29 20:09:27,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:09:27,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=475980.0, ans=0.125 2023-09-29 20:09:31,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:09:31,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:09:33,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:09:33,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:09:34,801 INFO [train.py:1039] (1/4) Epoch 14, batch 2350, loss[loss=0.1928, simple_loss=0.2471, pruned_loss=0.06927, over 23716.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.263, pruned_loss=0.05891, over 4700929.17 frames. ], batch size: 232, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:09:34,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:09:36,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 20:09:36,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=476046.6666666667, ans=0.125 2023-09-29 20:09:40,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:09:40,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 20:09:46,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 20:09:49,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:54,263 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:09:55,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:09:55,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:55,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 20:09:57,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=476113.3333333333, ans=0.125 2023-09-29 20:10:00,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:10:07,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 20:10:09,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:10:10,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:10:10,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:10:12,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=476180.0, ans=0.0 2023-09-29 20:10:15,307 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-09-29 20:10:15,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:10:17,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 20:10:17,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:10:19,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:10:20,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:20,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:10:25,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:10:26,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 20:10:27,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=476246.6666666667, ans=0.1 2023-09-29 20:10:28,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:10:30,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:10:30,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:10:30,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=476246.6666666667, ans=0.1 2023-09-29 20:10:32,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 20:10:34,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:10:35,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 20:10:37,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:10:41,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.87 vs. limit=22.5 2023-09-29 20:10:42,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 20:10:43,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 20:10:45,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:45,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:10:45,460 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 20:10:46,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 20:10:48,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 20:10:52,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:10:55,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:10:56,570 INFO [train.py:1039] (1/4) Epoch 14, batch 2400, loss[loss=0.1798, simple_loss=0.235, pruned_loss=0.06235, over 22738.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2622, pruned_loss=0.05938, over 4692056.69 frames. ], batch size: 322, lr: 7.41e-03, grad_scale: 32.0 2023-09-29 20:10:59,957 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.908e+02 2.123e+02 2.498e+02 3.353e+02, threshold=4.247e+02, percent-clipped=0.0 2023-09-29 20:11:00,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:11:01,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:11:01,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 20:11:03,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 20:11:11,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:11:11,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:11:13,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 20:11:14,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:11:16,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:16,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 20:11:22,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:24,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 20:11:29,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:11:30,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=476513.3333333333, ans=0.09899494936611666 2023-09-29 20:11:35,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 20:11:39,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:11:40,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:45,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:11:46,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=476580.0, ans=0.025 2023-09-29 20:11:47,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 20:11:48,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:11:53,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:11:56,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:12:01,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:01,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:12:01,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:12:02,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:12:02,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=476646.6666666667, ans=0.125 2023-09-29 20:12:03,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:03,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:03,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:12:08,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:08,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:12:08,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 20:12:10,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 20:12:12,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:12:12,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:13,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 20:12:14,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=476646.6666666667, ans=0.0 2023-09-29 20:12:15,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 20:12:15,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 20:12:15,444 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 20:12:16,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 20:12:18,292 INFO [train.py:1039] (1/4) Epoch 14, batch 2450, loss[loss=0.1917, simple_loss=0.2575, pruned_loss=0.0629, over 23760.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2614, pruned_loss=0.05876, over 4703990.51 frames. ], batch size: 212, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:12:18,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:12:19,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:19,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:21,464 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 20:12:23,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:23,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:12:24,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:12:24,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:28,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:28,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:29,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 20:12:33,332 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=10.28 vs. limit=10.0 2023-09-29 20:12:36,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:12:36,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:37,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:12:38,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.15 vs. limit=12.0 2023-09-29 20:12:39,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:12:39,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:12:39,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 20:12:40,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=476780.0, ans=0.0 2023-09-29 20:12:46,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:47,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:12:47,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:52,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:12:52,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:52,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:54,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:54,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 20:12:56,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:13:03,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:05,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:13:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:05,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:13:07,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:07,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:13:08,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 20:13:14,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:13:14,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:13:17,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:13:17,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:18,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.99 vs. limit=22.5 2023-09-29 20:13:22,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:13:22,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 20:13:23,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:13:25,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:13:25,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 20:13:25,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:13:26,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:13:31,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:13:31,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=476980.0, ans=0.125 2023-09-29 20:13:32,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:32,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:13:37,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 20:13:38,887 INFO [train.py:1039] (1/4) Epoch 14, batch 2500, loss[loss=0.1867, simple_loss=0.254, pruned_loss=0.05967, over 23694.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2613, pruned_loss=0.05875, over 4714590.44 frames. ], batch size: 149, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:13:39,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:13:40,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=477046.6666666667, ans=0.0 2023-09-29 20:13:44,487 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.907e+02 2.163e+02 2.456e+02 3.959e+02, threshold=4.326e+02, percent-clipped=0.0 2023-09-29 20:13:48,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:13:58,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:13:58,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:14:00,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:14:00,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 20:14:04,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:14:05,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:05,262 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=477113.3333333333, ans=10.0 2023-09-29 20:14:05,637 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.42 vs. limit=10.0 2023-09-29 20:14:06,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:14:06,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:14:08,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 20:14:09,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:10,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:10,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 20:14:11,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:12,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 20:14:12,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:16,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:14:18,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:23,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:14:25,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 20:14:27,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:14:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:33,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:39,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:43,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:14:47,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 20:14:47,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:47,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:14:50,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:14:50,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:14:52,888 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 20:14:52,889 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 20:14:52,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 20:14:54,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:58,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 20:14:58,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 20:14:59,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:59,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 20:15:03,145 INFO [train.py:1039] (1/4) Epoch 14, batch 2550, loss[loss=0.1875, simple_loss=0.2665, pruned_loss=0.05425, over 24412.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2626, pruned_loss=0.05854, over 4730734.43 frames. ], batch size: 77, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:15:04,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 20:15:07,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:09,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:15:09,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:15:12,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:12,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 20:15:12,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:15:14,666 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=477380.0, ans=0.0 2023-09-29 20:15:15,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 20:15:17,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:15:19,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:22,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:15:22,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 20:15:23,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:25,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:25,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:29,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:15:29,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 20:15:30,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:15:30,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:30,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 20:15:45,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:15:48,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:15:48,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:49,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:50,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:15:56,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:58,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:58,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:16:00,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:16:00,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:16:00,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:16:02,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:04,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:08,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:16:09,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 20:16:09,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:16:10,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:11,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:16:12,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:16:14,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:20,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:16:22,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:24,922 INFO [train.py:1039] (1/4) Epoch 14, batch 2600, loss[loss=0.1927, simple_loss=0.2804, pruned_loss=0.05254, over 24308.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2637, pruned_loss=0.05905, over 4736740.82 frames. ], batch size: 74, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:16:25,168 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 20:16:28,242 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 20:16:28,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:16:28,353 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 20:16:29,681 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.879e+02 2.138e+02 2.500e+02 3.129e+02, threshold=4.275e+02, percent-clipped=0.0 2023-09-29 20:16:29,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 20:16:29,865 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 20:16:32,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:32,178 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 20:16:35,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 20:16:37,642 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 20:16:40,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:16:44,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 20:16:45,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 20:16:47,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:16:47,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 20:16:50,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 20:16:50,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 20:16:57,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:16:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:57,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:16:57,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 20:16:57,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=477846.6666666667, ans=0.0 2023-09-29 20:16:58,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:17:03,614 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 20:17:11,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:11,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:13,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 20:17:14,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:14,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:17:14,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 20:17:19,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:17:19,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:17:21,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:24,589 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 20:17:25,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:26,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:17:31,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:32,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:17:32,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 20:17:33,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:36,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:17:36,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:17:36,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=477980.0, ans=0.1 2023-09-29 20:17:42,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=477980.0, ans=0.125 2023-09-29 20:17:44,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 20:17:46,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:48,998 INFO [train.py:1039] (1/4) Epoch 14, batch 2650, loss[loss=0.1724, simple_loss=0.2441, pruned_loss=0.05036, over 24328.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.264, pruned_loss=0.05938, over 4729318.92 frames. ], batch size: 56, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:17:49,434 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=478046.6666666667, ans=0.0 2023-09-29 20:17:49,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=478046.6666666667, ans=0.0 2023-09-29 20:17:50,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:17:54,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 20:17:55,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:57,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:17:58,753 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 20:17:58,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:01,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:03,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:18:03,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=478113.3333333333, ans=0.2 2023-09-29 20:18:03,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=478113.3333333333, ans=0.2 2023-09-29 20:18:04,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:18:06,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=478113.3333333333, ans=0.125 2023-09-29 20:18:07,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:18:09,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 20:18:09,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:18:10,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:18:12,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 20:18:12,760 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 20:18:13,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=478113.3333333333, ans=0.0 2023-09-29 20:18:17,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:18,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 20:18:18,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:20,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 20:18:25,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:25,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:18:25,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:25,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:31,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 20:18:31,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 20:18:31,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=478180.0, ans=0.1 2023-09-29 20:18:34,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:18:39,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 20:18:39,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:40,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:40,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:18:40,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:42,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:46,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:47,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:47,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:18:48,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:18:50,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:50,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:18:50,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:53,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:53,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:18:58,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:58,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=478313.3333333333, ans=0.0 2023-09-29 20:19:00,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:19:00,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:02,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 20:19:03,511 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=15.0 2023-09-29 20:19:05,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:06,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=478313.3333333333, ans=0.2 2023-09-29 20:19:07,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:07,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:08,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:10,261 INFO [train.py:1039] (1/4) Epoch 14, batch 2700, loss[loss=0.1836, simple_loss=0.2588, pruned_loss=0.05423, over 24476.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2649, pruned_loss=0.06013, over 4717099.71 frames. ], batch size: 63, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:19:10,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:19:10,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:13,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:13,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 20:19:14,652 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.969e+02 2.229e+02 2.662e+02 4.082e+02, threshold=4.458e+02, percent-clipped=0.0 2023-09-29 20:19:16,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:19:18,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:19:19,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:19:21,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:21,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:21,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=478380.0, ans=0.125 2023-09-29 20:19:22,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:19:22,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:23,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:19:23,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:19:24,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 20:19:25,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:19:25,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:19:27,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:19:28,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:32,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:19:32,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 20:19:33,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:19:38,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:19:38,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:19:40,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=478446.6666666667, ans=0.125 2023-09-29 20:19:44,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:19:44,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:44,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:19:44,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:19:47,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:19:49,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:49,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:19:49,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:19:51,291 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.35 vs. limit=15.0 2023-09-29 20:19:52,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=478513.3333333333, ans=0.1 2023-09-29 20:19:56,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:56,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:20:06,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:20:08,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:08,985 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:20:11,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=478580.0, ans=0.125 2023-09-29 20:20:12,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:20:12,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:15,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:16,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:17,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:20:18,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:20,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:20,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:20,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=478646.6666666667, ans=0.125 2023-09-29 20:20:23,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:20:25,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:25,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:26,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 20:20:26,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:30,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:20:30,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 20:20:31,484 INFO [train.py:1039] (1/4) Epoch 14, batch 2750, loss[loss=0.1799, simple_loss=0.2414, pruned_loss=0.05917, over 23784.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.265, pruned_loss=0.06066, over 4710707.11 frames. ], batch size: 212, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:20:32,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 20:20:32,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:35,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:35,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:35,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=478713.3333333333, ans=0.125 2023-09-29 20:20:37,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:39,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:20:39,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:44,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:20:46,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:20:46,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:20:46,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:46,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 20:20:46,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:20:46,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:52,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 20:20:54,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:55,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:55,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:55,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:20:57,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:57,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:20:57,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:58,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:03,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:21:03,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:21:03,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=478846.6666666667, ans=0.2 2023-09-29 20:21:05,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:21:06,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:07,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=478846.6666666667, ans=0.125 2023-09-29 20:21:08,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:21:14,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=478846.6666666667, ans=0.1 2023-09-29 20:21:16,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=478846.6666666667, ans=0.125 2023-09-29 20:21:17,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:18,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:21:18,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:19,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=478846.6666666667, ans=0.1 2023-09-29 20:21:22,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:22,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:21:24,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:21:31,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:21:31,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:21:31,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 20:21:33,517 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:21:36,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:38,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 20:21:40,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=478980.0, ans=0.1 2023-09-29 20:21:43,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:21:45,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:21:46,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 20:21:47,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:21:49,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:21:49,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 20:21:49,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:21:53,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:21:53,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=478980.0, ans=0.125 2023-09-29 20:21:54,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:21:54,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:21:56,020 INFO [train.py:1039] (1/4) Epoch 14, batch 2800, loss[loss=0.2053, simple_loss=0.2383, pruned_loss=0.08619, over 19394.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2623, pruned_loss=0.05982, over 4698395.93 frames. ], batch size: 388, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:21:56,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 20:21:56,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:21:57,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:59,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:00,645 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 20:22:00,646 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 20:22:03,508 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.993e+02 2.230e+02 2.590e+02 3.913e+02, threshold=4.460e+02, percent-clipped=0.0 2023-09-29 20:22:05,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:06,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:22:06,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:22:10,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:22:12,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 20:22:15,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:22:16,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 20:22:16,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:18,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:22:18,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:20,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=479113.3333333333, ans=0.1 2023-09-29 20:22:24,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:25,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:25,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:22:25,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:22:34,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:22:35,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:38,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:38,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:22:40,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:40,661 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=479180.0, ans=0.125 2023-09-29 20:22:45,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:22:45,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 20:22:46,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:48,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:48,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:22:51,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:51,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=479246.6666666667, ans=0.0 2023-09-29 20:22:53,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:57,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:23:00,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:23:00,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:23:00,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:23:00,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=479313.3333333333, ans=0.125 2023-09-29 20:23:01,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:23:02,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:23:02,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:23:03,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 20:23:04,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:04,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:23:04,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=479313.3333333333, ans=0.125 2023-09-29 20:23:05,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:07,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 20:23:07,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:08,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:23:08,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:23:09,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=479313.3333333333, ans=0.125 2023-09-29 20:23:10,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 20:23:16,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:23:16,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:23:16,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:23:18,229 INFO [train.py:1039] (1/4) Epoch 14, batch 2850, loss[loss=0.16, simple_loss=0.2374, pruned_loss=0.04133, over 24588.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2618, pruned_loss=0.05927, over 4694588.86 frames. ], batch size: 60, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:23:19,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:23,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:23:24,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:23:24,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:23:28,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:28,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:23:30,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:23:32,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 20:23:38,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 20:23:38,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:40,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 20:23:40,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:44,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 20:23:45,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=479446.6666666667, ans=0.0 2023-09-29 20:23:46,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 20:23:47,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:49,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=479513.3333333333, ans=0.125 2023-09-29 20:23:59,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:01,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:01,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:24:02,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:24:02,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:24:02,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:24:05,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:24:05,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 20:24:08,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:24:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:10,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:10,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:13,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:14,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:15,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:17,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:18,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:24:20,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:20,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.86 vs. limit=12.0 2023-09-29 20:24:21,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:24,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:24:27,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:24:31,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 20:24:31,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 20:24:33,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:24:34,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:34,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 20:24:34,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:24:36,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:36,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:36,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:24:36,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 20:24:36,499 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 20:24:36,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:24:38,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:41,440 INFO [train.py:1039] (1/4) Epoch 14, batch 2900, loss[loss=0.1836, simple_loss=0.2567, pruned_loss=0.05525, over 23438.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.262, pruned_loss=0.0591, over 4702002.63 frames. ], batch size: 119, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:24:44,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=479713.3333333333, ans=0.125 2023-09-29 20:24:45,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:24:45,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:46,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:46,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 20:24:50,314 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.184e+02 2.656e+02 3.783e+02, threshold=4.367e+02, percent-clipped=0.0 2023-09-29 20:24:51,463 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.79 vs. limit=10.0 2023-09-29 20:24:52,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:52,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 20:24:53,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 20:24:55,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:24:55,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:24:55,935 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.16 vs. limit=15.0 2023-09-29 20:24:58,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:59,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:25:02,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:25:02,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:25:06,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:25:06,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 20:25:08,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:25:09,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:13,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 20:25:14,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 20:25:17,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:25:17,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 20:25:17,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:25:19,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=479846.6666666667, ans=0.2 2023-09-29 20:25:21,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:25:21,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:25:22,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:25:24,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:27,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:25:30,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:25:31,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=479913.3333333333, ans=0.125 2023-09-29 20:25:32,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 20:25:32,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 20:25:32,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:25:34,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=479913.3333333333, ans=0.0 2023-09-29 20:25:37,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:25:40,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 20:25:42,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:25:47,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:59,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:25:59,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:26:01,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 20:26:05,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:05,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 20:26:06,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:06,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:26:07,899 INFO [train.py:1039] (1/4) Epoch 14, batch 2950, loss[loss=0.1981, simple_loss=0.2652, pruned_loss=0.0655, over 23769.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2629, pruned_loss=0.05948, over 4704443.15 frames. ], batch size: 164, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:26:11,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:12,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 20:26:13,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=480046.6666666667, ans=0.125 2023-09-29 20:26:14,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:14,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:14,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:26:16,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:26:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 20:26:18,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 20:26:21,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:26:21,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:26,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=15.0 2023-09-29 20:26:29,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:29,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=480113.3333333333, ans=0.2 2023-09-29 20:26:31,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:26:33,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:26:34,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:37,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:26:37,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:26:38,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=480113.3333333333, ans=0.125 2023-09-29 20:26:39,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:26:42,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 20:26:47,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 20:26:48,907 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 20:26:50,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:26:52,821 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 20:26:53,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.97 vs. limit=10.0 2023-09-29 20:26:54,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 20:26:54,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:54,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:54,653 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 20:26:54,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:26:57,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 20:27:01,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:27:01,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:27:01,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=480246.6666666667, ans=0.07 2023-09-29 20:27:04,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:06,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:27:06,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:06,327 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 20:27:08,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:08,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 20:27:10,664 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.01 vs. limit=15.0 2023-09-29 20:27:13,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:13,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:15,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 20:27:15,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:27:15,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 20:27:18,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:21,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:27:21,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:27:23,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:23,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:27:23,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=12.0 2023-09-29 20:27:24,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:27:24,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:24,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:27:25,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:27:26,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:28,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:27:29,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:30,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 20:27:31,337 INFO [train.py:1039] (1/4) Epoch 14, batch 3000, loss[loss=0.2018, simple_loss=0.2682, pruned_loss=0.06772, over 23460.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2638, pruned_loss=0.05928, over 4716651.59 frames. ], batch size: 285, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:27:31,338 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 20:27:40,856 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.9941, 4.8155, 4.6184, 4.2799], device='cuda:1') 2023-09-29 20:27:46,686 INFO [train.py:1071] (1/4) Epoch 14, validation: loss=0.2839, simple_loss=0.2749, pruned_loss=0.1465, over 1125622.00 frames. 2023-09-29 20:27:46,687 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 20:27:46,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:48,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:27:49,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:27:51,704 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 20:27:53,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 20:27:54,615 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.901e+02 2.128e+02 2.266e+02 3.715e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 20:27:54,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:56,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:27:56,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 20:27:56,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:28:01,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=480446.6666666667, ans=0.0 2023-09-29 20:28:04,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:28:06,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=480446.6666666667, ans=0.1 2023-09-29 20:28:08,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=480446.6666666667, ans=0.07 2023-09-29 20:28:17,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:28:22,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 20:28:24,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:28:26,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:28:26,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:28:26,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:28:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:29,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 20:28:32,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 20:28:34,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:28:35,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:28:37,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:28:38,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:38,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:38,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:28:41,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:28:42,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:42,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:28:44,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:47,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 20:28:49,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:28:50,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:28:50,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:28:53,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:53,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:56,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:28:56,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 20:28:57,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:28:57,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 20:28:57,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:28:59,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 20:29:01,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=480646.6666666667, ans=0.1 2023-09-29 20:29:02,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:03,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:29:03,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 20:29:05,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 20:29:05,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:29:05,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:29:06,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:29:08,224 INFO [train.py:1039] (1/4) Epoch 14, batch 3050, loss[loss=0.1737, simple_loss=0.2453, pruned_loss=0.0511, over 24337.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2647, pruned_loss=0.0594, over 4727410.80 frames. ], batch size: 56, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:29:08,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:29:08,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:08,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:29:13,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 20:29:13,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=480713.3333333333, ans=0.5 2023-09-29 20:29:15,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:18,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:18,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:29:20,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=480713.3333333333, ans=0.2 2023-09-29 20:29:23,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:27,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 20:29:29,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=480780.0, ans=0.0 2023-09-29 20:29:30,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 20:29:31,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.59 vs. limit=10.0 2023-09-29 20:29:33,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 20:29:33,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:36,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:29:37,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:37,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:39,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:45,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:29:45,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:45,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:46,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:46,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:47,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:49,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:52,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:52,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 20:29:54,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:54,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:29:57,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:57,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:29:59,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:29:59,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:07,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:30:07,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:13,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:13,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:30:13,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:30:16,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:18,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:30:18,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:30:20,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 20:30:22,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:22,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:22,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 20:30:25,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:30,451 INFO [train.py:1039] (1/4) Epoch 14, batch 3100, loss[loss=0.2197, simple_loss=0.2693, pruned_loss=0.08498, over 19733.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2637, pruned_loss=0.05905, over 4730417.58 frames. ], batch size: 389, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:30:32,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:34,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:30:35,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:30:38,656 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.952e+02 2.226e+02 2.517e+02 3.865e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 20:30:38,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 20:30:41,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 20:30:43,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 20:30:43,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:30:45,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:30:45,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:50,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:30:54,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:59,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 20:31:06,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:31:07,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:07,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:07,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:08,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:31:10,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:31:10,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 20:31:10,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:31:12,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:13,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 20:31:15,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:31:18,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:31:18,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=481246.6666666667, ans=0.0 2023-09-29 20:31:20,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 20:31:20,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 20:31:21,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:23,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:24,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:25,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:26,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:31:26,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:31:26,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:31:29,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:31:30,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:31:30,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:30,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 20:31:35,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:37,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 20:31:39,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:31:39,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 20:31:40,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:42,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:42,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 20:31:53,131 INFO [train.py:1039] (1/4) Epoch 14, batch 3150, loss[loss=0.2049, simple_loss=0.2881, pruned_loss=0.06086, over 24601.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2626, pruned_loss=0.05879, over 4722003.84 frames. ], batch size: 71, lr: 7.37e-03, grad_scale: 8.0 2023-09-29 20:31:53,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 20:31:56,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:31:57,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:57,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=481380.0, ans=0.1 2023-09-29 20:31:58,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:58,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:32:00,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 20:32:02,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:02,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:32:04,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 20:32:05,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:06,573 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.15 vs. limit=10.0 2023-09-29 20:32:09,002 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 20:32:12,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 20:32:12,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:32:14,124 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 20:32:14,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:32:15,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 20:32:17,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 20:32:17,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 20:32:17,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:17,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:18,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:20,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 20:32:21,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:21,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:22,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=481446.6666666667, ans=0.1 2023-09-29 20:32:23,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:23,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:32:28,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 20:32:28,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:32:31,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:32:31,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:33,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 20:32:38,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 20:32:38,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:32:39,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:32:39,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:32:40,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:32:40,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:32:40,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:32:40,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:32:43,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 20:32:43,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:32:43,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:47,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:32:47,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:47,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 20:32:47,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:49,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 20:32:49,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:49,453 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:32:50,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 20:32:50,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 20:32:53,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:32:53,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:55,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 20:32:56,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:32:57,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:33:01,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:33:01,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:01,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:33:09,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:33:09,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:11,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 20:33:16,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:33:16,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:33:18,099 INFO [train.py:1039] (1/4) Epoch 14, batch 3200, loss[loss=0.1709, simple_loss=0.2479, pruned_loss=0.0469, over 24534.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2621, pruned_loss=0.05877, over 4702065.87 frames. ], batch size: 60, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:33:20,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:20,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:33:20,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 20:33:22,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=481713.3333333333, ans=0.125 2023-09-29 20:33:24,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:33:26,518 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.865e+02 2.106e+02 2.358e+02 3.213e+02, threshold=4.213e+02, percent-clipped=0.0 2023-09-29 20:33:28,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:33:31,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:41,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:33:41,381 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:33:41,881 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.29 vs. limit=22.5 2023-09-29 20:33:52,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 20:33:53,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:33:56,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 20:33:56,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:34:01,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:34:01,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:34:03,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:34:07,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 20:34:09,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:34:09,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 20:34:10,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=481913.3333333333, ans=0.1 2023-09-29 20:34:12,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 20:34:15,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:34:21,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:21,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:34:23,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:23,492 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 20:34:23,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:34:24,294 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.72 vs. limit=15.0 2023-09-29 20:34:30,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:34:31,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 20:34:31,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=481980.0, ans=0.05 2023-09-29 20:34:33,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 20:34:34,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 20:34:36,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 20:34:37,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:34:40,980 INFO [train.py:1039] (1/4) Epoch 14, batch 3250, loss[loss=0.1827, simple_loss=0.2586, pruned_loss=0.05334, over 24334.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2625, pruned_loss=0.05872, over 4708625.12 frames. ], batch size: 61, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:34:41,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:34:41,118 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 20:34:41,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:34:41,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:34:42,703 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 20:34:46,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:34:48,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:34:51,634 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.96 vs. limit=22.5 2023-09-29 20:34:58,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:34:58,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 20:35:00,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:02,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:02,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:02,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:03,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:35:05,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:35:06,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:06,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:08,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:09,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:11,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:14,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:14,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:16,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:16,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:17,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:18,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=482180.0, ans=0.0 2023-09-29 20:35:21,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 20:35:23,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:35:23,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:35:25,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:27,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:35:33,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:35:35,372 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:35:41,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:35:42,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:42,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 20:35:42,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:35:42,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:35:42,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:44,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 20:35:44,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 20:35:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:47,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:47,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:48,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:35:48,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:49,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=482313.3333333333, ans=0.2 2023-09-29 20:35:52,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=482313.3333333333, ans=0.125 2023-09-29 20:35:53,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:53,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:56,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 20:35:56,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:00,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:36:00,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 20:36:03,639 INFO [train.py:1039] (1/4) Epoch 14, batch 3300, loss[loss=0.1881, simple_loss=0.2701, pruned_loss=0.05303, over 24515.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2638, pruned_loss=0.0591, over 4722977.91 frames. ], batch size: 71, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:36:03,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:36:03,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 20:36:04,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=482380.0, ans=0.125 2023-09-29 20:36:05,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 20:36:07,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 20:36:07,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:11,796 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.874e+02 2.156e+02 2.504e+02 3.971e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 20:36:12,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:36:12,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=482380.0, ans=0.0 2023-09-29 20:36:13,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:36:13,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:15,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=482380.0, ans=0.125 2023-09-29 20:36:16,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:36:16,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:36:18,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:21,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:36:25,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 20:36:26,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:26,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:27,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:27,739 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 20:36:29,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:36:30,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:36:30,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:36:30,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:36:30,974 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 20:36:34,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:34,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:36:35,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=482513.3333333333, ans=0.125 2023-09-29 20:36:37,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:37,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 20:36:38,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 20:36:39,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:40,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:36:43,579 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 20:36:45,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 20:36:45,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:36:46,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 20:36:49,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:36:52,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:36:53,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=482580.0, ans=0.0 2023-09-29 20:36:54,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:36:57,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:57,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:57,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:57,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:36:57,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=482580.0, ans=0.0 2023-09-29 20:37:00,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:37:00,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:00,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=482580.0, ans=0.125 2023-09-29 20:37:01,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:37:03,273 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 20:37:03,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 20:37:05,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:37:07,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:07,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:09,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:37:09,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:11,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:37:11,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:12,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:37:14,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:16,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:37:19,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 20:37:21,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:21,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:24,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:37:24,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:37:24,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:26,185 INFO [train.py:1039] (1/4) Epoch 14, batch 3350, loss[loss=0.1794, simple_loss=0.2666, pruned_loss=0.04614, over 24643.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2647, pruned_loss=0.05988, over 4719262.17 frames. ], batch size: 73, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:37:27,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:27,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:30,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:37:33,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:35,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:37:37,265 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:37:38,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:38,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=482713.3333333333, ans=0.0 2023-09-29 20:37:40,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:37:42,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:42,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:37:44,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 20:37:45,597 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 20:37:47,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:49,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 20:37:49,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 20:37:49,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:37:50,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:37:51,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:51,715 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.89 vs. limit=15.0 2023-09-29 20:37:53,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 20:37:53,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:53,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:37:56,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:58,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:58,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:00,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:38:03,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:06,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:06,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:08,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.96 vs. limit=15.0 2023-09-29 20:38:11,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:38:12,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:15,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:15,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=482913.3333333333, ans=0.1 2023-09-29 20:38:16,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:17,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:19,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 20:38:19,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:38:21,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 20:38:21,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:38:23,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 20:38:23,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:24,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:25,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=482913.3333333333, ans=0.0 2023-09-29 20:38:25,476 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.65 vs. limit=22.5 2023-09-29 20:38:30,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=482980.0, ans=0.2 2023-09-29 20:38:32,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:33,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 20:38:33,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:38:35,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:38:35,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:38:41,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:38:43,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 20:38:44,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:38:44,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:38:46,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:47,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 20:38:48,238 INFO [train.py:1039] (1/4) Epoch 14, batch 3400, loss[loss=0.1793, simple_loss=0.2597, pruned_loss=0.04947, over 24124.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2654, pruned_loss=0.06024, over 4710970.12 frames. ], batch size: 80, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:38:48,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:48,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 20:38:49,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:50,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:51,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:38:53,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:38:53,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 20:38:53,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=483046.6666666667, ans=0.125 2023-09-29 20:38:53,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=483046.6666666667, ans=0.0 2023-09-29 20:38:56,132 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.936e+02 2.106e+02 2.472e+02 5.174e+02, threshold=4.212e+02, percent-clipped=2.0 2023-09-29 20:38:58,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 20:38:58,506 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 20:38:58,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:02,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:39:02,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:39:03,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:05,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:39:12,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:14,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 20:39:16,624 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.76 vs. limit=15.0 2023-09-29 20:39:16,783 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.70 vs. limit=22.5 2023-09-29 20:39:18,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:39:22,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:23,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:25,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:39:29,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=483180.0, ans=0.125 2023-09-29 20:39:30,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:39:35,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 20:39:40,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 20:39:42,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:39:43,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:44,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:45,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:39:46,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:49,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:39:49,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:39:55,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:39:55,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=483313.3333333333, ans=0.0 2023-09-29 20:39:58,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 20:40:04,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:40:06,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=483313.3333333333, ans=0.125 2023-09-29 20:40:06,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=483313.3333333333, ans=0.125 2023-09-29 20:40:09,471 INFO [train.py:1039] (1/4) Epoch 14, batch 3450, loss[loss=0.2025, simple_loss=0.2766, pruned_loss=0.06423, over 23443.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2651, pruned_loss=0.06016, over 4718502.86 frames. ], batch size: 93, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:40:11,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 20:40:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 20:40:14,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:16,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:40:16,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 20:40:17,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:40:21,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:40:24,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:40:25,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:26,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=483446.6666666667, ans=0.0 2023-09-29 20:40:27,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:40:27,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:29,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=483446.6666666667, ans=0.0 2023-09-29 20:40:30,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:37,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 20:40:44,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 20:40:44,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:40:44,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:40:46,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:51,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 20:40:51,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:40:51,752 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=483513.3333333333, ans=0.2 2023-09-29 20:40:56,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:40:56,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:57,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:40:59,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:41:01,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 20:41:01,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:03,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:41:06,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=483580.0, ans=0.0 2023-09-29 20:41:06,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=483580.0, ans=0.125 2023-09-29 20:41:08,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:11,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 20:41:14,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:41:17,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=483646.6666666667, ans=0.05 2023-09-29 20:41:18,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:41:20,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=483646.6666666667, ans=0.1 2023-09-29 20:41:21,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:23,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:25,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-09-29 20:41:28,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:28,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:41:29,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:41:29,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:32,613 INFO [train.py:1039] (1/4) Epoch 14, batch 3500, loss[loss=0.1922, simple_loss=0.2752, pruned_loss=0.0546, over 24278.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2633, pruned_loss=0.05985, over 4703088.31 frames. ], batch size: 74, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:41:35,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:38,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:41:38,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 20:41:41,683 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.028e+02 2.360e+02 2.884e+02 5.509e+02, threshold=4.720e+02, percent-clipped=5.0 2023-09-29 20:41:41,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:41:44,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:41:49,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:49,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 20:41:54,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:41:54,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:56,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:41:56,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:41:56,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:41:58,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:58,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:41:58,948 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.05 vs. limit=15.0 2023-09-29 20:41:59,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 20:42:02,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:02,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:42:04,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:07,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:09,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 20:42:09,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:42:13,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:16,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:42:18,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:19,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:42:19,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:21,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 20:42:22,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.87 vs. limit=15.0 2023-09-29 20:42:23,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 20:42:23,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 20:42:24,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:26,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:27,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:28,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:42:30,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:42:30,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:42:35,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:42:38,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 20:42:38,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 20:42:38,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:42:40,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:40,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:40,860 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.81 vs. limit=15.0 2023-09-29 20:42:42,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:45,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 20:42:46,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:47,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:49,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 20:42:52,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 20:42:55,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:55,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:55,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:42:55,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:42:56,643 INFO [train.py:1039] (1/4) Epoch 14, batch 3550, loss[loss=0.1906, simple_loss=0.2607, pruned_loss=0.06024, over 24437.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2617, pruned_loss=0.05884, over 4709664.26 frames. ], batch size: 58, lr: 7.35e-03, grad_scale: 16.0 2023-09-29 20:42:58,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:43:11,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:12,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:43:14,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:16,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:43:16,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:18,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:43:18,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:43:22,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:43:23,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:23,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:43:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:43:28,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=484180.0, ans=0.125 2023-09-29 20:43:32,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:43:32,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:34,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:34,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:36,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:43:36,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 20:43:36,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:37,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:39,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:43:45,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=484246.6666666667, ans=0.2 2023-09-29 20:43:46,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:46,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:47,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:49,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 20:43:51,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:43:51,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 20:43:52,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:54,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:43:54,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:43:57,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 20:43:59,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:00,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=484246.6666666667, ans=15.0 2023-09-29 20:44:04,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:04,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 20:44:06,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:10,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:44:12,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 20:44:12,773 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:44:19,200 INFO [train.py:1039] (1/4) Epoch 14, batch 3600, loss[loss=0.1964, simple_loss=0.2696, pruned_loss=0.06153, over 23308.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2613, pruned_loss=0.05824, over 4707329.41 frames. ], batch size: 93, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:44:19,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 20:44:19,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:44:19,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=484380.0, ans=0.95 2023-09-29 20:44:20,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:44:22,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:23,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:24,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:44:27,550 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 2.079e+02 2.493e+02 3.972e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-29 20:44:31,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:32,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:34,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:44:35,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:44:37,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:37,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 20:44:40,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:44:40,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:44,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:48,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:44:48,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:44:49,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:49,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 20:44:50,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:53,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:55,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:44:57,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:59,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:45:01,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:01,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 20:45:01,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=484513.3333333333, ans=0.2 2023-09-29 20:45:09,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:10,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:45:10,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 20:45:15,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:45:20,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:23,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:24,603 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.90 vs. limit=15.0 2023-09-29 20:45:29,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:45:30,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:45:30,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 20:45:32,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 20:45:32,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 20:45:35,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:35,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:45:35,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 20:45:37,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:45:37,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:45:37,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:37,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 20:45:38,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 20:45:39,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=484646.6666666667, ans=0.125 2023-09-29 20:45:42,500 INFO [train.py:1039] (1/4) Epoch 14, batch 3650, loss[loss=0.1895, simple_loss=0.2627, pruned_loss=0.05816, over 23387.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2626, pruned_loss=0.05927, over 4703698.67 frames. ], batch size: 119, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:45:42,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:44,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 20:45:46,319 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.37 vs. limit=15.0 2023-09-29 20:45:48,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=484713.3333333333, ans=0.125 2023-09-29 20:45:49,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 20:45:50,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:45:51,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=484713.3333333333, ans=0.125 2023-09-29 20:45:52,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=484713.3333333333, ans=0.125 2023-09-29 20:45:54,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 20:45:55,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 20:46:00,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:00,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:46:02,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:46:03,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=484780.0, ans=0.0 2023-09-29 20:46:06,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:46:06,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:46:07,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 20:46:07,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:46:09,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:09,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 20:46:10,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:46:12,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:12,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:13,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:46:15,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 20:46:17,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 20:46:17,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:46:18,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=484846.6666666667, ans=0.125 2023-09-29 20:46:19,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 20:46:19,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=484846.6666666667, ans=0.05 2023-09-29 20:46:20,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:20,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:46:28,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:46:30,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:30,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:46:31,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:46:33,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:46:35,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:46:41,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:42,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:42,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:42,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:46:44,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:45,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:52,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 20:46:54,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:54,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:55,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:46:56,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:46:57,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:46:59,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:01,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 20:47:01,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:04,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:47:05,621 INFO [train.py:1039] (1/4) Epoch 14, batch 3700, loss[loss=0.1894, simple_loss=0.2649, pruned_loss=0.05694, over 24648.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2639, pruned_loss=0.05994, over 4700488.43 frames. ], batch size: 65, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:47:05,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:47:07,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:47:07,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=485046.6666666667, ans=0.1 2023-09-29 20:47:08,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:08,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 20:47:09,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:11,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:47:12,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:47:13,911 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.934e+02 2.131e+02 2.298e+02 2.848e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-29 20:47:14,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:47:14,416 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=485046.6666666667, ans=0.1 2023-09-29 20:47:17,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:47:18,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:18,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:47:18,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:19,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff3.min_abs, batch_count=485046.6666666667, ans=0.2 2023-09-29 20:47:20,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:47:20,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=485113.3333333333, ans=0.0 2023-09-29 20:47:21,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:24,105 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 20:47:32,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=485113.3333333333, ans=0.0 2023-09-29 20:47:33,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:47:33,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:47:35,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:47:35,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 20:47:35,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:37,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=485180.0, ans=0.0 2023-09-29 20:47:40,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:41,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 20:47:41,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:43,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:47:47,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:49,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:47:52,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:47:53,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:55,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 20:47:55,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:56,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 20:47:59,680 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.03 vs. limit=15.0 2023-09-29 20:48:00,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:48:02,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:48:05,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:07,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 20:48:08,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:48:08,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:48:10,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:10,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:13,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:14,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 20:48:16,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 20:48:18,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:48:18,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:18,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:48:20,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:48:23,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:48:24,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:48:26,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:48:27,884 INFO [train.py:1039] (1/4) Epoch 14, batch 3750, loss[loss=0.2011, simple_loss=0.2657, pruned_loss=0.06822, over 23602.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2642, pruned_loss=0.05968, over 4708277.82 frames. ], batch size: 256, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:48:29,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 20:48:29,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=485380.0, ans=0.125 2023-09-29 20:48:31,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:48:33,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-09-29 20:48:34,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:48:34,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 20:48:36,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:48:36,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=485380.0, ans=0.0 2023-09-29 20:48:36,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=485380.0, ans=0.0 2023-09-29 20:48:37,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:48:43,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:48:43,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=485446.6666666667, ans=0.2 2023-09-29 20:48:46,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:48:47,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:48:50,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:54,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:48:56,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 20:48:56,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:48:58,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:48:58,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:49:02,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 20:49:05,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 20:49:07,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:49:09,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:49:09,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:15,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:16,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:49:21,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 20:49:23,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:23,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=485580.0, ans=0.125 2023-09-29 20:49:27,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:49:27,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=485580.0, ans=0.025 2023-09-29 20:49:29,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:49:29,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=485580.0, ans=0.125 2023-09-29 20:49:33,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:49:38,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:49:39,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:49:39,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=485646.6666666667, ans=0.1 2023-09-29 20:49:41,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:49:43,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:49:46,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:49:51,125 INFO [train.py:1039] (1/4) Epoch 14, batch 3800, loss[loss=0.2049, simple_loss=0.2532, pruned_loss=0.07831, over 19776.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2636, pruned_loss=0.05937, over 4711247.08 frames. ], batch size: 388, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:49:54,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:49:57,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:59,282 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.914e+02 2.283e+02 2.549e+02 3.824e+02, threshold=4.565e+02, percent-clipped=0.0 2023-09-29 20:49:59,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:49:59,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 20:50:01,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:04,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:04,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:50:08,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 20:50:08,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:09,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:50:11,424 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=485780.0, ans=0.2 2023-09-29 20:50:12,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:12,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:50:12,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:12,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 20:50:18,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:50:19,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:50:20,717 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.18 vs. limit=5.0 2023-09-29 20:50:22,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:25,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:50:25,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:50:29,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:50:29,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:32,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:32,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:34,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=485846.6666666667, ans=0.0 2023-09-29 20:50:37,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:50:37,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 20:50:39,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:50:47,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:50:54,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:50:55,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 20:50:57,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 20:50:57,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:00,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:51:00,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:02,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 20:51:05,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 20:51:05,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 20:51:05,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:07,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:51:13,730 INFO [train.py:1039] (1/4) Epoch 14, batch 3850, loss[loss=0.1795, simple_loss=0.2695, pruned_loss=0.04479, over 24674.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2621, pruned_loss=0.05854, over 4712952.58 frames. ], batch size: 73, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:51:15,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:51:16,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:51:20,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:51:20,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 20:51:22,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:51:23,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:26,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:51:30,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:31,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:51:34,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 20:51:40,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:42,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:44,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:46,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:51:51,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:51,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:51:51,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:51,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:51:52,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:54,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:55,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:55,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:51:57,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 20:51:59,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 20:51:59,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:59,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:02,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 20:52:05,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 20:52:07,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:09,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 20:52:12,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:52:17,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=486246.6666666667, ans=0.95 2023-09-29 20:52:18,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:18,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=486313.3333333333, ans=0.125 2023-09-29 20:52:20,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:23,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:23,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 20:52:27,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 20:52:29,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:29,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:32,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:52:33,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:52:34,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:52:35,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 20:52:37,423 INFO [train.py:1039] (1/4) Epoch 14, batch 3900, loss[loss=0.1704, simple_loss=0.2172, pruned_loss=0.0618, over 18775.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2614, pruned_loss=0.05813, over 4713809.71 frames. ], batch size: 388, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:52:37,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:52:37,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 20:52:39,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:39,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:39,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:52:40,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:41,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:52:42,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:42,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:42,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:52:44,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 20:52:44,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:47,046 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.944e+02 2.146e+02 2.547e+02 3.892e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 20:52:48,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:48,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:48,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:52:53,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:53,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=486446.6666666667, ans=0.125 2023-09-29 20:52:54,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:55,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=486446.6666666667, ans=0.0 2023-09-29 20:52:56,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:58,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:52:59,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 20:52:59,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:53:03,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 20:53:03,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:53:03,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 20:53:05,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 20:53:08,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:10,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:53:10,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:53:10,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:15,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:18,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:53:19,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:53:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:21,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:53:28,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:53:28,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:53:37,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:53:38,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=486580.0, ans=0.1 2023-09-29 20:53:40,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:53:44,177 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.77 vs. limit=15.0 2023-09-29 20:53:49,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:53:51,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:53,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 20:53:54,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 20:53:54,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:54,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=486646.6666666667, ans=0.1 2023-09-29 20:53:56,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 20:53:56,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:57,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 20:54:01,167 INFO [train.py:1039] (1/4) Epoch 14, batch 3950, loss[loss=0.1886, simple_loss=0.2748, pruned_loss=0.05114, over 24631.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2616, pruned_loss=0.05799, over 4717772.04 frames. ], batch size: 73, lr: 7.33e-03, grad_scale: 16.0 2023-09-29 20:54:05,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:54:05,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=486713.3333333333, ans=0.125 2023-09-29 20:54:06,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=486713.3333333333, ans=0.1 2023-09-29 20:54:08,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 20:54:08,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:54:09,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:54:11,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:54:18,286 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 20:54:18,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:19,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 20:54:19,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 20:54:19,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:54:22,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:22,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:54:22,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:25,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 20:54:27,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:54:29,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:29,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:54:29,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:54:29,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:54:36,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.47 vs. limit=15.0 2023-09-29 20:54:41,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:54:43,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:54:49,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 20:54:49,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=486846.6666666667, ans=0.0 2023-09-29 20:54:52,824 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-09-29 20:54:55,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 20:54:55,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 20:54:56,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:54:58,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:54:58,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=486913.3333333333, ans=0.1 2023-09-29 20:55:05,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:55:07,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:55:07,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:07,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=486980.0, ans=0.0 2023-09-29 20:55:09,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:55:09,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 20:55:14,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:55:16,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:55:19,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 20:55:25,242 INFO [train.py:1039] (1/4) Epoch 14, batch 4000, loss[loss=0.199, simple_loss=0.2819, pruned_loss=0.05809, over 24599.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2619, pruned_loss=0.05798, over 4716299.56 frames. ], batch size: 68, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:55:29,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:34,256 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.864e+02 2.068e+02 2.328e+02 3.219e+02, threshold=4.135e+02, percent-clipped=0.0 2023-09-29 20:55:39,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:42,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:44,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:55:44,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:44,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 20:55:45,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:55:45,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 20:55:45,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:55:45,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 20:55:49,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:52,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:55:52,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:55:52,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:55:53,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:53,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:55:57,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:55:57,403 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 20:55:59,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:55:59,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:55:59,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=487180.0, ans=0.125 2023-09-29 20:56:02,593 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 20:56:04,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:56:04,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:04,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=487180.0, ans=0.125 2023-09-29 20:56:10,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 20:56:10,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:56:13,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:56:13,685 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 20:56:15,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:56:16,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 20:56:16,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:56:16,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:18,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:56:20,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:56:20,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:56:20,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:22,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 20:56:22,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:22,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=487246.6666666667, ans=0.1 2023-09-29 20:56:24,236 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 20:56:30,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:56:32,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:56:34,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:56:36,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:37,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:56:39,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:56:39,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=487313.3333333333, ans=0.025 2023-09-29 20:56:43,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:46,908 INFO [train.py:1039] (1/4) Epoch 14, batch 4050, loss[loss=0.1928, simple_loss=0.2537, pruned_loss=0.06593, over 23863.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2632, pruned_loss=0.05905, over 4696056.57 frames. ], batch size: 179, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:56:46,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:56:47,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 20:56:49,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:56:51,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:56:52,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:56:54,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:56:54,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:58,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:57:00,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:02,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:57:03,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:57:03,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:57:04,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=487446.6666666667, ans=0.0 2023-09-29 20:57:05,714 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=487446.6666666667, ans=0.1 2023-09-29 20:57:09,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:10,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:57:11,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.75 vs. limit=10.0 2023-09-29 20:57:12,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 20:57:14,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 20:57:14,054 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 20:57:17,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:57:23,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 20:57:27,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:57:30,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:33,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:33,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:57:33,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:37,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:40,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 20:57:40,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:57:42,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:46,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 20:57:50,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:59,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 20:57:59,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:01,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:58:01,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=487646.6666666667, ans=0.1 2023-09-29 20:58:02,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 20:58:02,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 20:58:02,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:04,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:06,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:06,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:58:10,377 INFO [train.py:1039] (1/4) Epoch 14, batch 4100, loss[loss=0.1606, simple_loss=0.2289, pruned_loss=0.04617, over 24344.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2638, pruned_loss=0.05866, over 4710109.87 frames. ], batch size: 56, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:58:14,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 20:58:15,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 20:58:18,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 20:58:20,535 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.945e+02 2.209e+02 2.502e+02 4.292e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 20:58:20,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 20:58:20,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:20,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:20,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:22,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:58:22,389 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 20:58:24,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:25,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:58:25,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:27,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:58:27,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=487780.0, ans=0.02 2023-09-29 20:58:32,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:58:33,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:33,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:58:33,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 20:58:35,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:35,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:58:35,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:37,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:58:37,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 20:58:38,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:58:40,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 20:58:42,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:45,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:45,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 20:58:45,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:46,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=487846.6666666667, ans=0.125 2023-09-29 20:58:47,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:58:47,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:58:50,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 20:58:52,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:58:54,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:58:55,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 20:58:56,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:57,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:59:00,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:02,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=487913.3333333333, ans=0.2 2023-09-29 20:59:05,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:10,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:10,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:59:21,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:21,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:28,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:59:33,282 INFO [train.py:1039] (1/4) Epoch 14, batch 4150, loss[loss=0.1847, simple_loss=0.2534, pruned_loss=0.05801, over 23674.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2636, pruned_loss=0.05872, over 4711426.50 frames. ], batch size: 149, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 20:59:33,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:59:34,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:59:35,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:59:35,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:35,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=488046.6666666667, ans=0.125 2023-09-29 20:59:38,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 20:59:38,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:40,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 20:59:41,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 20:59:41,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 20:59:43,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:48,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:59:48,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:53,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:59:54,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:59:56,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:59:58,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:59:58,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:59,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:00:04,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:09,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:09,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 21:00:09,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=488180.0, ans=0.125 2023-09-29 21:00:10,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=488180.0, ans=0.95 2023-09-29 21:00:12,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 21:00:12,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:00:13,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 21:00:13,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:00:14,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.34 vs. limit=5.0 2023-09-29 21:00:14,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:17,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:17,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:18,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=488180.0, ans=0.125 2023-09-29 21:00:23,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 21:00:26,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:29,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:00:29,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 21:00:30,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:32,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 21:00:34,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:00:37,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:37,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:39,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 21:00:39,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:00:39,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:00:41,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:00:44,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 21:00:45,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:45,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:00:45,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:00:45,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 21:00:45,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:47,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 21:00:48,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:50,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:50,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 21:00:50,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:55,497 INFO [train.py:1039] (1/4) Epoch 14, batch 4200, loss[loss=0.1801, simple_loss=0.2603, pruned_loss=0.04992, over 24651.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2638, pruned_loss=0.05858, over 4717556.67 frames. ], batch size: 65, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:00:55,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:00:55,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=488380.0, ans=0.0 2023-09-29 21:00:58,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=488380.0, ans=0.0 2023-09-29 21:00:59,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 21:01:00,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:01:03,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:05,346 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.014e+02 2.292e+02 2.781e+02 4.764e+02, threshold=4.585e+02, percent-clipped=1.0 2023-09-29 21:01:05,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:01:05,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:05,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:09,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 21:01:09,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=488380.0, ans=0.1 2023-09-29 21:01:11,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 21:01:12,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:14,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:19,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:01:22,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:01:22,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:23,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:23,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 21:01:23,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:25,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:25,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:25,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:01:27,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:01:28,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 21:01:28,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:33,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:01:34,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:01:37,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:01:39,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:01:41,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=488513.3333333333, ans=0.0 2023-09-29 21:01:42,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:01:42,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 21:01:42,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:01:43,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:01:48,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:01:51,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:57,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:01:59,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 21:02:02,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:03,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=488646.6666666667, ans=0.125 2023-09-29 21:02:07,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=488646.6666666667, ans=0.2 2023-09-29 21:02:09,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:02:09,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:12,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 21:02:17,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:02:19,041 INFO [train.py:1039] (1/4) Epoch 14, batch 4250, loss[loss=0.1814, simple_loss=0.2631, pruned_loss=0.04987, over 24682.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2623, pruned_loss=0.0583, over 4706245.37 frames. ], batch size: 68, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:02:22,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:02:23,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:02:25,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:30,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:02:32,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 21:02:32,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:02:33,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:38,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:02:44,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:44,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=488780.0, ans=0.125 2023-09-29 21:02:44,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=488780.0, ans=0.0 2023-09-29 21:02:45,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:46,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:02:46,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:02:50,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:52,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:53,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:53,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=488846.6666666667, ans=0.125 2023-09-29 21:02:55,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:02:57,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:57,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=488846.6666666667, ans=0.0 2023-09-29 21:02:58,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 21:03:03,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 21:03:03,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:03,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:03,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:03:06,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:03:06,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:06,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:08,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:03:11,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:03:15,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:17,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:18,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 21:03:18,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:03:18,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 21:03:20,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:03:22,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:03:24,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:24,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:03:24,909 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=15.0 2023-09-29 21:03:26,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 21:03:27,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:03:27,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:03:28,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=488980.0, ans=0.0 2023-09-29 21:03:32,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:35,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:36,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:03:39,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:40,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:42,508 INFO [train.py:1039] (1/4) Epoch 14, batch 4300, loss[loss=0.1692, simple_loss=0.2478, pruned_loss=0.04529, over 24558.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2619, pruned_loss=0.05813, over 4714800.79 frames. ], batch size: 60, lr: 7.32e-03, grad_scale: 16.0 2023-09-29 21:03:42,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:03:42,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=489046.6666666667, ans=0.125 2023-09-29 21:03:44,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:03:44,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 21:03:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:50,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:50,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:03:53,189 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.403e+02 1.977e+02 2.365e+02 3.006e+02 5.319e+02, threshold=4.729e+02, percent-clipped=1.0 2023-09-29 21:03:55,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=489046.6666666667, ans=0.07 2023-09-29 21:03:57,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:04:02,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=489113.3333333333, ans=0.125 2023-09-29 21:04:04,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:04:04,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 21:04:06,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:04:09,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:04:09,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:04:09,162 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 21:04:10,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:04:12,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:15,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 21:04:15,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:04:17,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 21:04:20,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:04:22,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:04:22,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:04:23,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:04:25,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:04:27,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:28,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:04:28,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 21:04:28,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 21:04:32,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:04:34,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:34,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:04:34,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:36,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:36,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 21:04:36,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 21:04:38,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 21:04:40,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:04:40,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 21:04:40,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=489246.6666666667, ans=0.125 2023-09-29 21:04:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 21:04:46,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:46,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=489246.6666666667, ans=0.125 2023-09-29 21:04:46,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=489246.6666666667, ans=0.125 2023-09-29 21:04:47,697 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 21:04:47,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:04:49,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:49,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:51,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 21:04:52,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:52,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:52,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:04:52,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:04:54,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:04:57,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:04:58,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:59,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:01,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:05:06,043 INFO [train.py:1039] (1/4) Epoch 14, batch 4350, loss[loss=0.1714, simple_loss=0.244, pruned_loss=0.04939, over 23544.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2627, pruned_loss=0.05823, over 4730968.89 frames. ], batch size: 106, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:05:06,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=489380.0, ans=15.0 2023-09-29 21:05:07,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 21:05:07,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:05:13,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:16,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:16,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=489380.0, ans=0.125 2023-09-29 21:05:18,217 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.76 vs. limit=22.5 2023-09-29 21:05:19,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:05:19,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:05:25,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:05:26,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:30,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:05:30,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:05:35,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:05:36,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:05:38,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:05:44,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 21:05:44,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:45,065 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:05:46,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:50,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:53,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 21:05:53,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=489580.0, ans=0.04949747468305833 2023-09-29 21:05:56,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:05:56,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:06:01,022 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 21:06:03,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:04,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:06:04,815 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 21:06:06,294 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 21:06:06,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:06,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:07,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:06:07,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:09,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:09,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:13,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 21:06:13,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:13,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:13,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:14,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 21:06:16,097 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 21:06:16,105 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 21:06:16,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 21:06:18,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:06:20,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:06:20,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:20,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:06:23,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 21:06:26,128 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 21:06:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:27,557 INFO [train.py:1039] (1/4) Epoch 14, batch 4400, loss[loss=0.1713, simple_loss=0.2386, pruned_loss=0.05197, over 24301.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2636, pruned_loss=0.05875, over 4738552.52 frames. ], batch size: 56, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:06:29,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:29,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:32,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:32,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=489713.3333333333, ans=0.04949747468305833 2023-09-29 21:06:35,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 21:06:35,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 21:06:37,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 21:06:37,341 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 21:06:37,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:06:37,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:38,935 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.905e+02 2.228e+02 2.642e+02 4.473e+02, threshold=4.456e+02, percent-clipped=0.0 2023-09-29 21:06:40,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 21:06:42,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:43,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:43,803 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 21:06:47,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:47,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 21:06:47,537 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 21:06:50,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 21:06:50,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=489780.0, ans=0.04949747468305833 2023-09-29 21:06:50,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=489780.0, ans=0.0 2023-09-29 21:06:52,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 21:06:52,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 21:06:52,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:54,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:54,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:56,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:59,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 21:06:59,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 21:06:59,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:02,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:07:02,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:04,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:05,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:05,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 21:07:07,053 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 21:07:10,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:16,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:07:19,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 21:07:23,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:07:27,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:28,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:07:30,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 21:07:30,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:07:30,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:07:30,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:07:31,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:07:37,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 21:07:39,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 21:07:40,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 21:07:40,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:40,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 21:07:42,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:07:45,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:07:46,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=489980.0, ans=0.125 2023-09-29 21:07:47,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 21:07:49,001 INFO [train.py:1039] (1/4) Epoch 14, batch 4450, loss[loss=0.1736, simple_loss=0.244, pruned_loss=0.05157, over 24404.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2639, pruned_loss=0.05868, over 4741110.24 frames. ], batch size: 58, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:07:50,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:53,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:55,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:07:57,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=490046.6666666667, ans=0.0 2023-09-29 21:08:02,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:02,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:08:05,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:07,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:08:10,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:08:12,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:13,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 21:08:13,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:13,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:13,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:08:13,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:08:14,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=490113.3333333333, ans=0.0 2023-09-29 21:08:16,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:08:23,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:23,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:25,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:25,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=490180.0, ans=0.0 2023-09-29 21:08:26,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:27,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:08:29,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=490180.0, ans=0.125 2023-09-29 21:08:33,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:08:35,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 21:08:35,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 21:08:35,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:08:39,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:40,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 21:08:44,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:08:49,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:49,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 21:08:49,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:49,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:08:49,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:08:49,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:52,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:56,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:08:57,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 21:08:59,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:08:59,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:02,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:09:02,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:02,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:09:04,900 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.77 vs. limit=6.0 2023-09-29 21:09:06,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:09:10,428 INFO [train.py:1039] (1/4) Epoch 14, batch 4500, loss[loss=0.1899, simple_loss=0.2676, pruned_loss=0.05613, over 23736.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2636, pruned_loss=0.05839, over 4733744.86 frames. ], batch size: 85, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:09:10,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 21:09:10,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:09:17,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:17,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 21:09:17,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 21:09:19,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:23,950 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.876e+02 2.126e+02 2.360e+02 4.104e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 21:09:24,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:25,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:25,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:09:27,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:09:27,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:27,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:37,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=490446.6666666667, ans=0.0 2023-09-29 21:09:41,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:42,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:09:45,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:45,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:09:47,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:09:53,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:09:59,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:10:00,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:10:06,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:10:06,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 21:10:07,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:07,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:09,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=490580.0, ans=0.1 2023-09-29 21:10:11,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:11,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:10:14,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:10:14,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 21:10:14,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:10:14,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:16,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=490646.6666666667, ans=0.0 2023-09-29 21:10:19,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:10:19,903 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-09-29 21:10:20,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:10:22,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:25,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:10:26,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:10:28,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 21:10:29,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 21:10:29,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 21:10:32,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 21:10:34,055 INFO [train.py:1039] (1/4) Epoch 14, batch 4550, loss[loss=0.177, simple_loss=0.2326, pruned_loss=0.06064, over 23402.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2628, pruned_loss=0.05799, over 4739097.99 frames. ], batch size: 285, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:10:36,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 21:10:36,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:10:39,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:41,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:43,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-09-29 21:10:45,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:49,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:10:52,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:10:52,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:10:52,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:55,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:57,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:11:01,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:03,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 21:11:04,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 21:11:06,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:11:07,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 21:11:09,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 21:11:11,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:13,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 21:11:15,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:11:18,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:18,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:19,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:11:21,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 21:11:26,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:27,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:27,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:29,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:31,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 21:11:32,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 21:11:32,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:11:32,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 21:11:36,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 21:11:36,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:38,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:11:38,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:39,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:39,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:11:42,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:11:42,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 21:11:44,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:44,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:11:46,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 21:11:46,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:11:46,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 21:11:49,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=490980.0, ans=0.125 2023-09-29 21:11:51,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:11:51,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:11:54,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:11:54,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:54,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:11:57,359 INFO [train.py:1039] (1/4) Epoch 14, batch 4600, loss[loss=0.1953, simple_loss=0.2848, pruned_loss=0.05288, over 24436.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2618, pruned_loss=0.05823, over 4719511.78 frames. ], batch size: 69, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:11:57,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:11:57,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:12:02,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:03,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:12:07,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:12:07,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:12:07,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=491046.6666666667, ans=0.125 2023-09-29 21:12:08,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:08,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 21:12:11,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:12:12,507 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.889e+02 2.188e+02 2.520e+02 3.712e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 21:12:15,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:12:15,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:17,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:27,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 21:12:28,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:31,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:34,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:12:34,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:39,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 21:12:39,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:12:39,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:12:44,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:44,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:12:46,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:12:50,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 21:12:52,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:12:57,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:12:58,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:01,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:01,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 21:13:01,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:02,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 21:13:02,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:02,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:05,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:05,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:13:07,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:07,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 21:13:07,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 21:13:08,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=491313.3333333333, ans=0.125 2023-09-29 21:13:09,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 21:13:09,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:09,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:11,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:11,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:18,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=491313.3333333333, ans=0.5 2023-09-29 21:13:18,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=491313.3333333333, ans=0.125 2023-09-29 21:13:20,986 INFO [train.py:1039] (1/4) Epoch 14, batch 4650, loss[loss=0.1689, simple_loss=0.2418, pruned_loss=0.04806, over 24470.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2611, pruned_loss=0.05802, over 4715020.27 frames. ], batch size: 58, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:13:24,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:13:27,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:28,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:28,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:13:28,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:30,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:30,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:34,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 21:13:39,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:13:40,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 21:13:42,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:42,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 21:13:42,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:13:44,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 21:13:44,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 21:13:44,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:44,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:13:48,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:13:49,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:49,619 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 21:13:53,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:56,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 21:13:59,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:00,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:14:00,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 21:14:02,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:14:06,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:14:09,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:14,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:17,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:17,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:19,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:14:20,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 21:14:22,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 21:14:23,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 21:14:23,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 21:14:24,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:30,178 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=491646.6666666667, ans=0.2 2023-09-29 21:14:30,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=491646.6666666667, ans=0.2 2023-09-29 21:14:31,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:14:31,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:14:31,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 21:14:32,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:34,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:34,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:14:36,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:14:36,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=491646.6666666667, ans=0.125 2023-09-29 21:14:37,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:14:37,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:39,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:43,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:44,420 INFO [train.py:1039] (1/4) Epoch 14, batch 4700, loss[loss=0.1954, simple_loss=0.2632, pruned_loss=0.06384, over 23878.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2622, pruned_loss=0.05819, over 4731636.11 frames. ], batch size: 195, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:14:44,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:14:44,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:14:44,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:14:46,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:14:48,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 21:14:54,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=491713.3333333333, ans=0.125 2023-09-29 21:14:56,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:58,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:59,754 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.978e+02 2.336e+02 2.752e+02 4.215e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 21:14:59,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:00,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:02,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:15:08,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 21:15:08,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 21:15:09,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:11,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:15:11,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:15:14,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:21,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:15:23,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:15:25,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:26,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=491846.6666666667, ans=0.125 2023-09-29 21:15:31,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 21:15:33,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:15:36,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:38,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 21:15:40,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:15:43,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:15:45,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 21:15:46,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:46,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:51,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:51,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:15:51,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 21:15:54,090 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 21:15:55,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:55,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:55,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:55,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 21:15:59,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:16:02,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 21:16:04,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.52 vs. limit=15.0 2023-09-29 21:16:05,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:16:06,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=491980.0, ans=0.125 2023-09-29 21:16:08,508 INFO [train.py:1039] (1/4) Epoch 14, batch 4750, loss[loss=0.2167, simple_loss=0.2796, pruned_loss=0.07695, over 22770.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2635, pruned_loss=0.05853, over 4726043.34 frames. ], batch size: 322, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:16:08,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:16:14,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=492046.6666666667, ans=0.125 2023-09-29 21:16:15,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 21:16:15,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:18,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 21:16:20,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:16:21,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:16:22,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:27,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 21:16:29,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=492113.3333333333, ans=0.5 2023-09-29 21:16:32,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:16:32,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=492113.3333333333, ans=0.0 2023-09-29 21:16:35,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 21:16:35,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:38,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:38,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:39,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=492113.3333333333, ans=0.0 2023-09-29 21:16:40,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:42,289 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 21:16:42,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 21:16:48,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 21:16:50,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:51,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.68 vs. limit=15.0 2023-09-29 21:16:52,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:16:53,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=492180.0, ans=0.0 2023-09-29 21:16:56,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:16:56,486 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 21:16:56,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:16:58,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:17:01,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:17:04,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 21:17:04,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 21:17:04,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:17:04,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:17:04,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:06,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:17:08,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 21:17:10,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 21:17:12,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:16,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:17:16,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 21:17:18,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:19,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:17:21,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:17:23,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:23,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:17:26,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:26,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 21:17:28,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 21:17:29,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 21:17:30,935 INFO [train.py:1039] (1/4) Epoch 14, batch 4800, loss[loss=0.1653, simple_loss=0.2395, pruned_loss=0.0456, over 24330.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2642, pruned_loss=0.05854, over 4733679.32 frames. ], batch size: 56, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:17:33,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:17:34,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:36,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 21:17:37,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=492380.0, ans=0.125 2023-09-29 21:17:40,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:42,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:45,640 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.984e+02 2.307e+02 2.840e+02 4.511e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 21:17:46,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=492446.6666666667, ans=0.2 2023-09-29 21:17:47,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:17:47,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.19 vs. limit=12.0 2023-09-29 21:17:48,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:49,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=492446.6666666667, ans=15.0 2023-09-29 21:17:50,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:50,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 21:17:50,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:51,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:17:53,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:17:55,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=492446.6666666667, ans=0.5 2023-09-29 21:17:58,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:00,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:00,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:18:01,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=492446.6666666667, ans=0.125 2023-09-29 21:18:02,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:02,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:18:02,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:03,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:06,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:10,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:18:13,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:18:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:16,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 21:18:16,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 21:18:18,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:19,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:18:19,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:18:19,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:19,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:18:21,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:18:21,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:26,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:26,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=492580.0, ans=0.125 2023-09-29 21:18:30,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:31,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:34,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=492580.0, ans=0.0 2023-09-29 21:18:37,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 21:18:38,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:38,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:38,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:18:38,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:42,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=492646.6666666667, ans=0.2 2023-09-29 21:18:43,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:45,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:18:45,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:45,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=492646.6666666667, ans=0.125 2023-09-29 21:18:46,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:18:46,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:18:47,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:18:51,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:51,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:51,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:53,063 INFO [train.py:1039] (1/4) Epoch 14, batch 4850, loss[loss=0.1863, simple_loss=0.2523, pruned_loss=0.06019, over 23470.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2635, pruned_loss=0.05823, over 4739664.42 frames. ], batch size: 119, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:18:53,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 21:18:55,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=492713.3333333333, ans=0.125 2023-09-29 21:18:56,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 21:18:56,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:18:56,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:00,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:19:07,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 21:19:07,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=492713.3333333333, ans=0.0 2023-09-29 21:19:10,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:11,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=492780.0, ans=0.2 2023-09-29 21:19:13,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:15,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:19:15,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:19,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:20,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:19:22,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:19:22,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 21:19:26,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:19:28,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:19:28,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:19:30,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:19:30,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 21:19:33,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:33,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:37,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=492846.6666666667, ans=0.1 2023-09-29 21:19:37,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=492846.6666666667, ans=0.0 2023-09-29 21:19:38,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:38,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 21:19:40,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 21:19:40,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:19:47,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:19:48,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 21:19:50,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:19:50,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:19:53,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:19:55,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 21:19:55,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:55,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 21:19:55,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:57,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:19:58,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 21:19:58,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=492980.0, ans=0.0 2023-09-29 21:20:00,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=492980.0, ans=0.125 2023-09-29 21:20:07,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:13,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:20:13,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:17,050 INFO [train.py:1039] (1/4) Epoch 14, batch 4900, loss[loss=0.2011, simple_loss=0.2605, pruned_loss=0.07084, over 23795.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2622, pruned_loss=0.05799, over 4732595.14 frames. ], batch size: 179, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:20:18,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 21:20:18,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:20:23,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:25,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:20:30,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 21:20:31,620 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.872e+02 2.087e+02 2.309e+02 3.318e+02, threshold=4.174e+02, percent-clipped=0.0 2023-09-29 21:20:32,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.22 vs. limit=10.0 2023-09-29 21:20:33,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 21:20:37,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 21:20:38,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 21:20:40,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:40,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:40,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:20:40,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:40,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:20:42,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 21:20:48,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 21:20:48,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:20:48,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:20:50,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:52,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:20:53,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:55,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:55,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 21:20:56,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:20:58,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:58,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 21:20:58,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 21:20:59,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=493180.0, ans=0.2 2023-09-29 21:21:04,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 21:21:06,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:21:07,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:07,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:21:08,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:08,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:21:09,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:21:09,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 21:21:11,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:12,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:21:14,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:21:20,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 21:21:21,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:21:21,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:21:23,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 21:21:27,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=493313.3333333333, ans=0.125 2023-09-29 21:21:28,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:31,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:21:33,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 21:21:33,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:33,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:21:36,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:39,479 INFO [train.py:1039] (1/4) Epoch 14, batch 4950, loss[loss=0.1876, simple_loss=0.2593, pruned_loss=0.05796, over 23353.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2603, pruned_loss=0.05788, over 4716192.42 frames. ], batch size: 93, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:21:39,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:21:39,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:21:39,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:39,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 21:21:42,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:21:44,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:21:45,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:49,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 21:21:49,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 21:21:49,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:21:49,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 21:21:49,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:49,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:51,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:21:51,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:21:54,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:55,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:21:55,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:21:57,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:22:00,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:00,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:22:02,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=493446.6666666667, ans=0.0 2023-09-29 21:22:04,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=493446.6666666667, ans=0.0 2023-09-29 21:22:05,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:22:10,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:10,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:22:11,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=493513.3333333333, ans=0.1 2023-09-29 21:22:12,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:13,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:22:16,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 21:22:16,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 21:22:19,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:22,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:22:22,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:22:23,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:22:23,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:22:25,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:22:26,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:29,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:22:32,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:22:33,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=493580.0, ans=0.0 2023-09-29 21:22:34,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:34,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:36,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 21:22:36,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:22:38,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:22:41,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:22:43,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:22:43,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:22:45,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:45,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:22:46,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:22:48,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:22:49,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:22:49,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:51,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 21:22:53,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=493646.6666666667, ans=0.0 2023-09-29 21:22:54,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:01,122 INFO [train.py:1039] (1/4) Epoch 14, batch 5000, loss[loss=0.1597, simple_loss=0.2347, pruned_loss=0.04242, over 24318.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2596, pruned_loss=0.05744, over 4721347.13 frames. ], batch size: 56, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:23:01,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 21:23:01,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:23:01,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=493713.3333333333, ans=0.09899494936611666 2023-09-29 21:23:06,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:06,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:09,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 21:23:09,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 21:23:11,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:23:14,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 21:23:14,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:23:14,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:23:15,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=493713.3333333333, ans=0.125 2023-09-29 21:23:16,185 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.874e+02 2.097e+02 2.409e+02 3.545e+02, threshold=4.194e+02, percent-clipped=0.0 2023-09-29 21:23:16,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 21:23:16,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:17,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:19,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 21:23:19,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:19,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:22,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 21:23:22,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 21:23:22,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:23:23,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 21:23:23,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:23:24,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:25,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:23:25,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 21:23:25,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 21:23:27,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=493780.0, ans=0.0 2023-09-29 21:23:28,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 21:23:28,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:30,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:30,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 21:23:30,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:33,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:35,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:35,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:23:35,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 21:23:36,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:23:39,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:23:39,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=493846.6666666667, ans=0.2 2023-09-29 21:23:40,782 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 21:23:46,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:46,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:46,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:23:48,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=493846.6666666667, ans=0.0 2023-09-29 21:23:51,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 21:23:51,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:51,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:51,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:54,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 21:23:54,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:56,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=493913.3333333333, ans=0.1 2023-09-29 21:23:57,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:57,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:04,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 21:24:09,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:09,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=493980.0, ans=0.0 2023-09-29 21:24:18,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:24:20,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:20,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:24:22,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:22,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:24:22,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:24:22,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:24,190 INFO [train.py:1039] (1/4) Epoch 14, batch 5050, loss[loss=0.2038, simple_loss=0.2687, pruned_loss=0.06943, over 23788.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2603, pruned_loss=0.05756, over 4720953.03 frames. ], batch size: 164, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:24:24,737 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=494046.6666666667, ans=0.125 2023-09-29 21:24:28,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:28,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 21:24:31,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:24:34,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:34,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:24:36,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 21:24:38,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:38,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:24:40,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:24:41,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=494113.3333333333, ans=0.0 2023-09-29 21:24:42,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:24:42,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:24:51,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 21:24:51,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:24:53,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:24:53,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 21:24:55,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:24:56,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:24:58,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:58,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:24:58,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 21:25:00,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 21:25:02,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:03,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:05,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=494180.0, ans=0.5 2023-09-29 21:25:06,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:06,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=494180.0, ans=0.125 2023-09-29 21:25:08,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 21:25:08,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=494180.0, ans=0.125 2023-09-29 21:25:09,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:13,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 21:25:13,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:25:14,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:25:14,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:16,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:25:18,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:25:20,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:25:20,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=494246.6666666667, ans=0.0 2023-09-29 21:25:21,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:21,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:25:21,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:25:23,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 21:25:24,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:25:26,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:25:26,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=494246.6666666667, ans=0.0 2023-09-29 21:25:31,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:31,413 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 21:25:31,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:25:33,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:25:33,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:33,646 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 21:25:36,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:36,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 21:25:36,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:41,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:41,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:42,117 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=28.44 vs. limit=22.5 2023-09-29 21:25:42,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 21:25:44,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 21:25:46,461 INFO [train.py:1039] (1/4) Epoch 14, batch 5100, loss[loss=0.1993, simple_loss=0.2676, pruned_loss=0.06555, over 23787.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2615, pruned_loss=0.05775, over 4712456.68 frames. ], batch size: 179, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:25:48,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:25:48,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:25:48,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:25:51,280 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 21:25:54,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:57,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 21:25:59,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 21:25:59,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:00,834 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.801e+02 1.996e+02 2.340e+02 4.098e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-29 21:26:01,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:26:04,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:26:04,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 21:26:04,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 21:26:11,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:26:11,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:26:13,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=494446.6666666667, ans=0.1 2023-09-29 21:26:15,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:18,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 21:26:18,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:21,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:26:22,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:26:25,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 21:26:28,552 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 21:26:28,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:28,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 21:26:29,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 21:26:31,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=494513.3333333333, ans=0.0 2023-09-29 21:26:32,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:35,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=494580.0, ans=0.0 2023-09-29 21:26:40,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:26:42,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 21:26:42,503 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 21:26:42,518 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 21:26:45,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 21:26:45,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:45,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=494580.0, ans=0.2 2023-09-29 21:26:47,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 21:26:51,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 21:26:53,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:26:55,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:26:58,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 21:26:58,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:27:00,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 21:27:05,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:27:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:27:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:27:05,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:27:05,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:27:07,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:27:08,859 INFO [train.py:1039] (1/4) Epoch 14, batch 5150, loss[loss=0.189, simple_loss=0.268, pruned_loss=0.05496, over 24295.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2625, pruned_loss=0.05817, over 4702836.29 frames. ], batch size: 61, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:27:08,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 21:27:08,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 21:27:10,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 21:27:10,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:27:10,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 21:27:11,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.88 vs. limit=15.0 2023-09-29 21:27:11,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:11,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:27:14,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=494713.3333333333, ans=0.125 2023-09-29 21:27:15,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:15,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:20,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:27:21,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 21:27:21,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:22,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=494713.3333333333, ans=0.1 2023-09-29 21:27:23,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:27:25,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:27:25,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:25,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:26,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:27:26,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:27:26,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 21:27:30,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:27:30,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:27:32,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:27:34,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 21:27:34,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=494780.0, ans=0.125 2023-09-29 21:27:35,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:27:42,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:27:43,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 21:27:45,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.45 vs. limit=15.0 2023-09-29 21:27:48,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:53,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:55,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:28:00,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:00,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:05,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 21:28:10,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:28:11,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:28:11,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:28:14,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:16,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:18,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 21:28:22,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:28:23,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.51 vs. limit=22.5 2023-09-29 21:28:24,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:28:28,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:28:28,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:28:29,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:28:29,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:28:29,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:28:29,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:28:31,108 INFO [train.py:1039] (1/4) Epoch 14, batch 5200, loss[loss=0.2059, simple_loss=0.2677, pruned_loss=0.07205, over 23709.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2632, pruned_loss=0.05845, over 4709127.79 frames. ], batch size: 232, lr: 7.27e-03, grad_scale: 32.0 2023-09-29 21:28:32,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:28:33,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:28:36,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:40,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 21:28:40,572 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.14 vs. limit=10.0 2023-09-29 21:28:41,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:28:43,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:45,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:45,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:28:45,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:47,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=495113.3333333333, ans=0.125 2023-09-29 21:28:48,045 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.902e+02 2.088e+02 2.453e+02 3.691e+02, threshold=4.175e+02, percent-clipped=0.0 2023-09-29 21:28:48,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 21:28:49,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:28:51,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:54,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 21:28:57,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:28:59,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:29:01,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 21:29:01,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 21:29:01,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=495113.3333333333, ans=0.125 2023-09-29 21:29:01,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=495113.3333333333, ans=0.125 2023-09-29 21:29:04,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 21:29:04,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:04,525 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 21:29:04,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:29:04,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=495180.0, ans=0.1 2023-09-29 21:29:07,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:07,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:29:09,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 21:29:10,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:29:12,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:13,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=495180.0, ans=0.2 2023-09-29 21:29:16,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 21:29:16,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 21:29:16,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 21:29:20,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=495246.6666666667, ans=0.125 2023-09-29 21:29:21,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 21:29:21,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=495246.6666666667, ans=0.125 2023-09-29 21:29:22,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:29:27,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:29:27,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:29,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 21:29:30,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:30,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:29:30,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:30,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:29:33,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.25 vs. limit=6.0 2023-09-29 21:29:35,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:39,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:29:43,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:44,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:29:44,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:50,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:52,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 21:29:52,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:52,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:29:53,789 INFO [train.py:1039] (1/4) Epoch 14, batch 5250, loss[loss=0.1799, simple_loss=0.264, pruned_loss=0.04788, over 24031.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2625, pruned_loss=0.05823, over 4708899.86 frames. ], batch size: 80, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:29:54,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:54,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:29:54,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=495380.0, ans=0.0 2023-09-29 21:29:57,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:29:58,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=495380.0, ans=0.125 2023-09-29 21:29:59,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:30:00,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:01,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:30:01,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=495380.0, ans=0.125 2023-09-29 21:30:02,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:30:07,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:30:09,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=495446.6666666667, ans=0.1 2023-09-29 21:30:10,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:30:12,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:30:15,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:30:17,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 21:30:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:17,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:30:20,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=495446.6666666667, ans=0.0 2023-09-29 21:30:50,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=495580.0, ans=0.125 2023-09-29 21:31:05,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=495646.6666666667, ans=0.0 2023-09-29 21:31:07,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=495713.3333333333, ans=0.0 2023-09-29 21:31:08,522 INFO [train.py:1039] (1/4) Epoch 14, batch 5300, loss[loss=0.1809, simple_loss=0.2341, pruned_loss=0.06388, over 22721.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2612, pruned_loss=0.0586, over 4692206.07 frames. ], batch size: 322, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:31:14,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=495713.3333333333, ans=0.125 2023-09-29 21:31:14,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=495713.3333333333, ans=0.125 2023-09-29 21:31:20,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=495713.3333333333, ans=0.0 2023-09-29 21:31:22,581 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.904e+02 2.089e+02 2.457e+02 4.761e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-29 21:31:23,297 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.93 vs. limit=12.0 2023-09-29 21:31:24,317 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=495780.0, ans=0.125 2023-09-29 21:31:25,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:31:25,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 21:31:25,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 21:31:25,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:26,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:26,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:26,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:26,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:26,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:31:26,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:26,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:31:27,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:31:27,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 21:31:27,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 21:31:27,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 21:31:27,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:31:27,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 21:31:28,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 21:31:28,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:28,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:28,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:28,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:29,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:31:30,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:30,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:30,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:30,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:30,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:30,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:31:30,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:30,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:31:31,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 21:31:31,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:31,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:32,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 21:31:32,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 21:31:32,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:31:32,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:31:32,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 21:31:32,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 21:31:32,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:33,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:31:34,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:34,259 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 21:31:34,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 21:31:34,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:31:34,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:34,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 21:31:34,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 21:31:34,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 21:31:35,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:43,618 INFO [train.py:1039] (1/4) Epoch 15, batch 0, loss[loss=0.1931, simple_loss=0.2798, pruned_loss=0.05319, over 24581.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2798, pruned_loss=0.05319, over 24581.00 frames. ], batch size: 71, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:31:43,619 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 21:31:58,979 INFO [train.py:1071] (1/4) Epoch 15, validation: loss=0.2846, simple_loss=0.2783, pruned_loss=0.1455, over 1125622.00 frames. 2023-09-29 21:31:58,980 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 21:32:01,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=495800.0, ans=0.0 2023-09-29 21:32:02,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 21:32:06,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:32:07,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:32:11,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:11,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:32:11,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:12,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 21:32:14,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 21:32:14,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=495866.6666666667, ans=0.125 2023-09-29 21:32:17,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:18,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:22,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:23,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:23,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:32:23,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:25,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 21:32:26,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:27,465 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.92 vs. limit=22.5 2023-09-29 21:32:35,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:32:35,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:39,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 21:32:44,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:32:44,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:32:45,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:50,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:32:53,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:56,005 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.56 vs. limit=6.0 2023-09-29 21:32:58,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 21:32:59,582 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.12 vs. limit=15.0 2023-09-29 21:33:02,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 21:33:03,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:03,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:04,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:33:05,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:06,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 21:33:11,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:11,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:16,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:33:17,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=496066.6666666667, ans=0.0 2023-09-29 21:33:18,791 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 21:33:19,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=496066.6666666667, ans=0.1 2023-09-29 21:33:21,699 INFO [train.py:1039] (1/4) Epoch 15, batch 50, loss[loss=0.1929, simple_loss=0.2747, pruned_loss=0.05552, over 24090.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2639, pruned_loss=0.05674, over 1072399.55 frames. ], batch size: 80, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:33:21,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:33:23,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=496133.3333333333, ans=0.0 2023-09-29 21:33:24,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:26,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:26,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 21:33:27,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:33:27,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:33:29,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:31,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=496133.3333333333, ans=0.125 2023-09-29 21:33:32,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:33,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:37,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 21:33:37,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:37,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=496200.0, ans=0.0 2023-09-29 21:33:42,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:33:45,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 21:33:46,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 21:33:48,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:33:49,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:33:49,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:50,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:52,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:33:52,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:33:52,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:59,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:00,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:00,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:34:02,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 21:34:04,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:34:05,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:34:05,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 21:34:07,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:10,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 21:34:17,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:34:18,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.18 vs. limit=12.0 2023-09-29 21:34:19,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:19,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:21,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:21,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:21,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=496333.3333333333, ans=0.2 2023-09-29 21:34:25,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 21:34:25,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 21:34:28,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:28,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:28,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=496400.0, ans=0.0 2023-09-29 21:34:30,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=496400.0, ans=0.125 2023-09-29 21:34:31,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:34:31,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:32,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 21:34:32,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 21:34:34,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:34:35,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:36,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:34:37,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 21:34:37,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 21:34:37,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:37,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:39,136 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.074e+02 2.565e+02 3.305e+02 5.603e+02, threshold=5.131e+02, percent-clipped=8.0 2023-09-29 21:34:40,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:34:40,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:34:42,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:34:44,221 INFO [train.py:1039] (1/4) Epoch 15, batch 100, loss[loss=0.1944, simple_loss=0.2774, pruned_loss=0.05572, over 24458.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2662, pruned_loss=0.05996, over 1864238.57 frames. ], batch size: 69, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:34:45,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:34:49,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.10 vs. limit=15.0 2023-09-29 21:34:50,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:34:53,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 21:34:53,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:57,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:34:57,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:57,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:57,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:57,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:59,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 21:35:03,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:35:03,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:03,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:03,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:35:07,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 21:35:10,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:11,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:12,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:35:13,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=496533.3333333333, ans=0.0 2023-09-29 21:35:14,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:35:14,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=496533.3333333333, ans=0.0 2023-09-29 21:35:18,032 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 21:35:19,414 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 21:35:20,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:35:20,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:35:21,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.92 vs. limit=15.0 2023-09-29 21:35:25,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:35:27,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:27,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:33,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:34,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 21:35:37,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:35:39,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=496666.6666666667, ans=0.125 2023-09-29 21:35:42,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:35:44,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:35:45,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:48,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:52,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:35:52,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:35:54,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.89 vs. limit=15.0 2023-09-29 21:35:55,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:57,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:57,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:57,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:35:58,704 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:00,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 21:36:00,196 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 21:36:00,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:00,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:36:02,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:02,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:02,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:36:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:36:03,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:36:03,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:05,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:06,713 INFO [train.py:1039] (1/4) Epoch 15, batch 150, loss[loss=0.2022, simple_loss=0.2807, pruned_loss=0.06184, over 24521.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2663, pruned_loss=0.06038, over 2493236.38 frames. ], batch size: 66, lr: 7.01e-03, grad_scale: 32.0 2023-09-29 21:36:06,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:08,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:36:08,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:36:11,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:13,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:36:13,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:15,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:18,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:18,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:21,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:36:23,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:24,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=496866.6666666667, ans=0.125 2023-09-29 21:36:29,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 21:36:29,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 21:36:29,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 21:36:30,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=22.5 2023-09-29 21:36:32,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:36:32,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:36:32,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:36:34,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:34,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:34,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:34,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:36,793 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 21:36:39,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:43,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:43,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=496933.3333333333, ans=0.07 2023-09-29 21:36:46,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:36:48,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 21:36:52,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:36:52,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:52,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:36:54,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:36:54,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:56,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:36:57,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:57,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 21:37:01,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:04,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:04,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:37:04,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:37:08,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:08,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=497000.0, ans=0.0 2023-09-29 21:37:09,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 21:37:12,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:37:14,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:37:15,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:19,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:37:19,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 21:37:19,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:37:19,415 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 21:37:23,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:26,178 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.806e+02 2.055e+02 2.590e+02 4.271e+02, threshold=4.110e+02, percent-clipped=0.0 2023-09-29 21:37:26,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:37:26,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:37:29,058 INFO [train.py:1039] (1/4) Epoch 15, batch 200, loss[loss=0.1765, simple_loss=0.2615, pruned_loss=0.04574, over 24552.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2666, pruned_loss=0.05992, over 2996890.16 frames. ], batch size: 71, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:37:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 21:37:30,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:30,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:34,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 21:37:36,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:37:39,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:39,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:39,814 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.55 vs. limit=15.0 2023-09-29 21:37:43,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0 2023-09-29 21:37:44,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:37:44,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:45,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:49,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=497200.0, ans=0.0 2023-09-29 21:38:08,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:38:10,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:38:10,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:38:10,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:38:12,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 21:38:12,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:38:14,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=497266.6666666667, ans=0.2 2023-09-29 21:38:14,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=497266.6666666667, ans=0.125 2023-09-29 21:38:15,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:16,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:38:18,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:20,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:21,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 21:38:21,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:38:21,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:25,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:38:30,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:32,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=497333.3333333333, ans=0.0 2023-09-29 21:38:39,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:39,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:38:47,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:48,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 21:38:50,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:50,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:38:50,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:51,884 INFO [train.py:1039] (1/4) Epoch 15, batch 250, loss[loss=0.1894, simple_loss=0.2685, pruned_loss=0.05515, over 24453.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2662, pruned_loss=0.05889, over 3394569.72 frames. ], batch size: 63, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:38:51,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:38:53,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 21:38:54,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:38:54,918 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 21:38:56,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:58,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:39:02,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:02,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:39:03,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:39:03,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:05,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:39:09,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:39:20,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:24,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:39:25,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:39:30,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=497600.0, ans=0.0 2023-09-29 21:39:31,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:39:31,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:39:33,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:39:35,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:35,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:39:35,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:39:35,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:37,805 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.07 vs. limit=22.5 2023-09-29 21:39:38,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:39:41,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 21:39:43,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:44,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:39:45,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:39:45,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:39:45,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:39:47,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:39:47,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:39:47,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=497666.6666666667, ans=0.0 2023-09-29 21:39:48,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:39:50,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:39:50,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:39:53,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=497666.6666666667, ans=0.2 2023-09-29 21:39:56,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:39:56,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=497666.6666666667, ans=0.125 2023-09-29 21:39:58,270 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.59 vs. limit=22.5 2023-09-29 21:40:00,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:02,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:40:06,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:08,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:40:12,380 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.826e+02 2.078e+02 2.374e+02 4.039e+02, threshold=4.156e+02, percent-clipped=0.0 2023-09-29 21:40:12,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 21:40:12,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=497733.3333333333, ans=0.125 2023-09-29 21:40:14,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:40:14,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:40:16,025 INFO [train.py:1039] (1/4) Epoch 15, batch 300, loss[loss=0.1925, simple_loss=0.253, pruned_loss=0.06597, over 23757.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2638, pruned_loss=0.05846, over 3686536.80 frames. ], batch size: 164, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:40:17,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 21:40:17,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:40:19,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:40:19,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 21:40:24,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:25,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:40:29,202 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=497800.0, ans=0.0 2023-09-29 21:40:31,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:40:31,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 21:40:31,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:32,006 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=15.0 2023-09-29 21:40:32,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:40:34,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 21:40:34,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:37,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:40:40,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=497866.6666666667, ans=0.0 2023-09-29 21:40:42,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:40:44,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 21:40:44,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=497866.6666666667, ans=0.125 2023-09-29 21:40:48,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 21:40:48,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:52,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:54,371 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.68 vs. limit=10.0 2023-09-29 21:40:55,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:55,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 21:40:55,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:40:55,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:40:58,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:40:58,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:04,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:41:04,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 21:41:05,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:41:08,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:10,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 21:41:11,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:15,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:41:17,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:41:17,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 21:41:21,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:21,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:41:23,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:25,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:41:26,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 21:41:26,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:41:28,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:29,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 21:41:32,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:32,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:34,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:34,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:35,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:37,678 INFO [train.py:1039] (1/4) Epoch 15, batch 350, loss[loss=0.1928, simple_loss=0.2785, pruned_loss=0.05352, over 24037.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.261, pruned_loss=0.05796, over 3905988.81 frames. ], batch size: 86, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:41:40,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:40,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:41:44,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:49,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:52,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:54,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 21:41:59,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:59,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 21:42:01,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:02,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 21:42:02,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:06,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 21:42:07,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=498200.0, ans=0.2 2023-09-29 21:42:08,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:42:10,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:10,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:42:12,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:13,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:13,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:42:15,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:42:17,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:24,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:42:24,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:42:25,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:42:27,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:31,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 21:42:31,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:34,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=498333.3333333333, ans=0.125 2023-09-29 21:42:37,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:37,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:37,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:42:40,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 21:42:40,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:40,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=498333.3333333333, ans=0.125 2023-09-29 21:42:42,091 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 21:42:43,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 21:42:43,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:45,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:45,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 21:42:47,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=498400.0, ans=0.125 2023-09-29 21:42:47,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=498400.0, ans=0.125 2023-09-29 21:42:49,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:51,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:42:52,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:54,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:54,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:56,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:57,507 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.856e+02 2.198e+02 2.696e+02 4.798e+02, threshold=4.395e+02, percent-clipped=2.0 2023-09-29 21:42:59,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:43:00,683 INFO [train.py:1039] (1/4) Epoch 15, batch 400, loss[loss=0.2023, simple_loss=0.2876, pruned_loss=0.0585, over 24625.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.26, pruned_loss=0.05744, over 4089697.03 frames. ], batch size: 68, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:43:00,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:43:02,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 21:43:02,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:03,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:05,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:43:07,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:10,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:12,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:13,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 21:43:13,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 21:43:13,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:15,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 21:43:15,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:17,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=498533.3333333333, ans=0.125 2023-09-29 21:43:20,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:43:20,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 21:43:20,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:43:22,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:22,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:22,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:27,359 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 21:43:27,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 21:43:29,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=498533.3333333333, ans=0.0 2023-09-29 21:43:32,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:33,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:33,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=498600.0, ans=0.125 2023-09-29 21:43:35,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 21:43:36,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 21:43:39,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:43:42,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:43:48,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 21:43:52,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:43:54,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 21:43:58,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:44:01,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:44:02,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 21:44:04,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:44:07,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:44:08,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:44:13,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:13,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 21:44:13,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=498733.3333333333, ans=0.1 2023-09-29 21:44:15,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:44:16,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 21:44:18,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:44:18,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:44:20,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 21:44:21,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:44:23,364 INFO [train.py:1039] (1/4) Epoch 15, batch 450, loss[loss=0.1694, simple_loss=0.2418, pruned_loss=0.04853, over 24471.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2607, pruned_loss=0.05765, over 4222832.67 frames. ], batch size: 58, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:44:23,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:44:23,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:44:25,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 21:44:25,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:44:26,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:44:28,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:44:28,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 21:44:30,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:44:32,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:44:33,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:44:43,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:45,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:44:45,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 21:44:45,968 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=12.0 2023-09-29 21:44:46,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 21:44:48,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=498866.6666666667, ans=0.125 2023-09-29 21:44:53,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:44:54,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:57,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:00,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:00,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:01,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=498933.3333333333, ans=0.1 2023-09-29 21:45:03,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 21:45:05,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 21:45:07,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 21:45:07,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:09,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:09,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:45:11,811 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 21:45:11,824 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 21:45:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:45:13,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:45:13,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:45:18,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:45:18,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:45:19,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:45:19,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 21:45:22,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:24,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:45:24,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:45:26,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 21:45:29,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:45:31,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 21:45:31,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 21:45:32,107 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.55 vs. limit=15.0 2023-09-29 21:45:32,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:37,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=499066.6666666667, ans=0.0 2023-09-29 21:45:39,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:45:40,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:45:42,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:45:42,983 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 21:45:44,924 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.892e+02 2.180e+02 2.454e+02 3.588e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 21:45:46,451 INFO [train.py:1039] (1/4) Epoch 15, batch 500, loss[loss=0.1866, simple_loss=0.2725, pruned_loss=0.05036, over 24472.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2613, pruned_loss=0.05793, over 4343347.98 frames. ], batch size: 69, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:45:48,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:49,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:45:51,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:51,130 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 21:45:52,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 21:45:52,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:54,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:45:59,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:46:01,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:46:02,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:46:02,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:46:04,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:05,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=499200.0, ans=0.125 2023-09-29 21:46:16,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:18,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:46:18,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:46:20,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:21,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 21:46:21,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:46:23,392 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-09-29 21:46:24,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:46:24,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=499266.6666666667, ans=0.125 2023-09-29 21:46:25,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:46:25,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:46:25,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:27,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 21:46:30,126 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 21:46:31,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:46:33,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:37,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:46:38,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 21:46:41,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:46:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:46:45,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=499333.3333333333, ans=0.125 2023-09-29 21:46:46,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:46:50,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:50,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=499400.0, ans=0.07 2023-09-29 21:46:52,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=499400.0, ans=10.0 2023-09-29 21:46:56,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:01,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 21:47:01,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:01,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:02,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.51 vs. limit=15.0 2023-09-29 21:47:03,259 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=499400.0, ans=0.125 2023-09-29 21:47:04,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 21:47:04,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:47:06,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:07,619 INFO [train.py:1039] (1/4) Epoch 15, batch 550, loss[loss=0.1751, simple_loss=0.2656, pruned_loss=0.04235, over 24367.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2633, pruned_loss=0.05884, over 4430551.27 frames. ], batch size: 77, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:47:09,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 21:47:10,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 21:47:13,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:13,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 21:47:14,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:47:14,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:14,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=499466.6666666667, ans=0.0 2023-09-29 21:47:16,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:47:17,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:47:18,348 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.47 vs. limit=15.0 2023-09-29 21:47:19,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:22,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 21:47:22,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:47:23,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=499533.3333333333, ans=0.1 2023-09-29 21:47:25,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=499533.3333333333, ans=0.0 2023-09-29 21:47:28,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:30,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:33,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:47:33,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:37,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 21:47:39,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 21:47:41,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:47:44,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:47:44,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:46,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:47:51,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:51,013 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 21:47:52,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:52,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:47:55,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:57,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:47:57,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:47:59,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:00,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 21:48:02,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 21:48:04,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:04,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:48:06,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:06,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:48:08,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:48:09,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:48:12,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:48:13,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:14,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:48:16,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:48:18,079 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.89 vs. limit=15.0 2023-09-29 21:48:18,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:19,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:48:19,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:21,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:48:21,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:48:23,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=499733.3333333333, ans=0.1 2023-09-29 21:48:27,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 21:48:28,879 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.896e+02 2.048e+02 2.393e+02 3.212e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-29 21:48:30,433 INFO [train.py:1039] (1/4) Epoch 15, batch 600, loss[loss=0.2044, simple_loss=0.2727, pruned_loss=0.06807, over 23760.00 frames. ], tot_loss[loss=0.192, simple_loss=0.265, pruned_loss=0.05955, over 4485497.39 frames. ], batch size: 135, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:48:31,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 21:48:33,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:48:33,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:48:33,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:41,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:48:41,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:48:42,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 21:48:45,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:48:47,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:48:49,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:52,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 21:48:52,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:58,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 21:48:58,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=499866.6666666667, ans=0.05 2023-09-29 21:49:01,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:49:01,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:03,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:49:09,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:49:09,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:49:09,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:14,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=499933.3333333333, ans=0.0 2023-09-29 21:49:17,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:49:22,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:22,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:49:22,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:30,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 21:49:30,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=500000.0, ans=0.0 2023-09-29 21:49:38,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:49:38,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:49:40,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=500066.6666666667, ans=0.2 2023-09-29 21:49:44,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 21:49:45,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:49:49,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 21:49:49,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:49:50,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:49:53,635 INFO [train.py:1039] (1/4) Epoch 15, batch 650, loss[loss=0.21, simple_loss=0.2649, pruned_loss=0.07753, over 23826.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2636, pruned_loss=0.05925, over 4529819.58 frames. ], batch size: 150, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:49:53,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:49:55,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:49:57,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:49:59,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:50:00,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:04,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 21:50:05,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:50:10,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:50:10,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:15,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:19,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 21:50:20,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:20,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:25,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:50:25,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 21:50:28,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:30,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:32,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:50:32,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:32,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=500266.6666666667, ans=0.04949747468305833 2023-09-29 21:50:33,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:50:35,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:50:35,501 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 21:50:35,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:35,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:40,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:40,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:41,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:50:43,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:50:44,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 21:50:44,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:50:44,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:50:46,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:50:46,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:46,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=500333.3333333333, ans=0.125 2023-09-29 21:50:47,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:50:50,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 21:50:52,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 21:50:52,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:52,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:52,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:50:53,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:54,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=500333.3333333333, ans=0.125 2023-09-29 21:50:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:58,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=500400.0, ans=0.07 2023-09-29 21:51:00,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:00,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:51:05,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:05,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 21:51:05,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:07,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=500400.0, ans=0.125 2023-09-29 21:51:12,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:51:12,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:14,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:14,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:15,839 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.967e+02 2.230e+02 2.701e+02 4.378e+02, threshold=4.460e+02, percent-clipped=5.0 2023-09-29 21:51:15,881 INFO [train.py:1039] (1/4) Epoch 15, batch 700, loss[loss=0.184, simple_loss=0.2699, pruned_loss=0.04904, over 24417.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2627, pruned_loss=0.05869, over 4573952.29 frames. ], batch size: 69, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:51:16,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=500466.6666666667, ans=0.125 2023-09-29 21:51:20,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 21:51:20,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 21:51:23,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 21:51:24,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:28,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:51:29,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 21:51:32,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:33,346 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=500533.3333333333, ans=0.0 2023-09-29 21:51:36,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:51:36,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:39,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:51:39,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:42,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:45,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 21:51:45,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:51:47,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 21:51:47,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=500600.0, ans=0.125 2023-09-29 21:51:49,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 21:51:53,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:51:53,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:51:56,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:52:03,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:52:03,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 21:52:07,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:09,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:52:09,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 21:52:11,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=500666.6666666667, ans=0.125 2023-09-29 21:52:14,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:52:15,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:18,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:52:20,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=500733.3333333333, ans=0.1 2023-09-29 21:52:25,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:52:25,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 21:52:27,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 21:52:28,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 21:52:30,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:31,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=500733.3333333333, ans=0.125 2023-09-29 21:52:32,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:33,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:52:34,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:34,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 21:52:37,181 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.56 vs. limit=15.0 2023-09-29 21:52:38,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=500800.0, ans=10.0 2023-09-29 21:52:39,404 INFO [train.py:1039] (1/4) Epoch 15, batch 750, loss[loss=0.2091, simple_loss=0.261, pruned_loss=0.0786, over 18960.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2623, pruned_loss=0.05869, over 4594754.74 frames. ], batch size: 388, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:52:41,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 21:52:41,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 21:52:41,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 21:52:42,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 21:52:42,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 21:52:44,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:52:46,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 21:52:46,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:47,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:52:48,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:49,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:51,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:52:51,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:54,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:52:54,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:52:57,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:52:58,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:59,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:00,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 21:53:01,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:53:03,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:53:06,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 21:53:07,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:10,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 21:53:10,127 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 21:53:11,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 21:53:11,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:53:11,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:53:14,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:53:16,590 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=500933.3333333333, ans=0.125 2023-09-29 21:53:18,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=500933.3333333333, ans=0.0 2023-09-29 21:53:21,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:53:22,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:22,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:53:24,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:53:25,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-29 21:53:26,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:53:26,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 21:53:27,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:53:29,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:53:29,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:53:32,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:53:32,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 21:53:34,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:35,257 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=15.0 2023-09-29 21:53:41,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:53:41,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:53:41,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=501000.0, ans=0.1 2023-09-29 21:53:42,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:44,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:53:48,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=501066.6666666667, ans=0.125 2023-09-29 21:53:49,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 21:53:49,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:53:49,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:54,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:54,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:57,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:57,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:54:02,046 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.875e+02 2.035e+02 2.283e+02 3.726e+02, threshold=4.071e+02, percent-clipped=0.0 2023-09-29 21:54:02,104 INFO [train.py:1039] (1/4) Epoch 15, batch 800, loss[loss=0.2278, simple_loss=0.2915, pruned_loss=0.08202, over 22785.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2628, pruned_loss=0.05914, over 4606964.11 frames. ], batch size: 322, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:54:04,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:04,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:06,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=501133.3333333333, ans=0.0 2023-09-29 21:54:07,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:54:07,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:08,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=501133.3333333333, ans=0.1 2023-09-29 21:54:09,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:09,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:11,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:15,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:15,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:54:18,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 21:54:20,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:21,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:21,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:54:21,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:23,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 21:54:23,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:25,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 21:54:28,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:32,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:35,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:54:35,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:38,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:38,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:42,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:54:43,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:54:43,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:54:47,199 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 21:54:47,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 21:54:47,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:54:47,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:48,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:48,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:54:50,930 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.96 vs. limit=12.0 2023-09-29 21:54:51,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.89 vs. limit=12.0 2023-09-29 21:54:55,122 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 21:54:55,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 21:54:58,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:54:59,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:55:04,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:55:09,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:11,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 21:55:11,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:55:14,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 21:55:21,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:24,943 INFO [train.py:1039] (1/4) Epoch 15, batch 850, loss[loss=0.1894, simple_loss=0.2702, pruned_loss=0.05433, over 24627.00 frames. ], tot_loss[loss=0.19, simple_loss=0.263, pruned_loss=0.05852, over 4646968.04 frames. ], batch size: 68, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:55:24,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:55:25,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 21:55:25,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:55:26,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:28,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 21:55:29,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:55:32,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:34,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:55:35,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:55:37,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 21:55:37,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 21:55:39,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 21:55:40,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:40,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:55:42,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:42,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:44,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:55:50,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:50,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:52,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 21:55:55,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 21:55:58,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:56:01,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 21:56:05,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 21:56:07,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 21:56:08,840 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 21:56:08,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:08,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:56:08,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:56:11,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:12,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:13,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 21:56:16,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:17,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:18,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:56:18,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:56:20,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:56:22,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:56:24,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 21:56:27,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:56:28,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:28,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:56:28,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:30,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:34,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:36,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:56:36,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=501733.3333333333, ans=0.07 2023-09-29 21:56:37,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:56:39,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:56:39,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:56:46,686 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.806e+02 2.003e+02 2.266e+02 2.717e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-29 21:56:46,730 INFO [train.py:1039] (1/4) Epoch 15, batch 900, loss[loss=0.1573, simple_loss=0.2377, pruned_loss=0.03849, over 24299.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2633, pruned_loss=0.05833, over 4663174.14 frames. ], batch size: 56, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:56:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:56:50,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:50,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 21:56:51,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:56:51,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:53,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 21:56:54,558 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-09-29 21:56:55,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=501800.0, ans=0.0 2023-09-29 21:57:00,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:57:03,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:03,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 21:57:07,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:57:07,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 21:57:09,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:57:09,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:57:09,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:10,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:57:10,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:57:22,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:22,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:22,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:57:25,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:28,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 21:57:31,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:57:36,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:57:36,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:57:39,111 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 21:57:39,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 21:57:45,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:57:46,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:57:46,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:57:52,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:53,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:57:54,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 21:57:54,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:56,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 21:57:58,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:57:58,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:01,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:01,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:06,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 21:58:06,868 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 21:58:08,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:58:08,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 21:58:09,932 INFO [train.py:1039] (1/4) Epoch 15, batch 950, loss[loss=0.2076, simple_loss=0.2901, pruned_loss=0.06258, over 24339.00 frames. ], tot_loss[loss=0.191, simple_loss=0.264, pruned_loss=0.05896, over 4674778.81 frames. ], batch size: 77, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:58:12,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:12,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=502133.3333333333, ans=0.125 2023-09-29 21:58:15,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 21:58:22,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:22,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=502133.3333333333, ans=0.0 2023-09-29 21:58:23,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:23,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:25,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:58:28,124 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 21:58:30,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:30,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:30,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:30,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:58:31,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 21:58:33,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:58:35,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:36,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 21:58:36,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:41,081 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.01 vs. limit=12.0 2023-09-29 21:58:44,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:44,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:44,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:45,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 21:58:49,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:58:50,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:52,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:58:58,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:58,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:58,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=502333.3333333333, ans=0.05 2023-09-29 21:59:01,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 21:59:02,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 21:59:02,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:59:04,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:04,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:04,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:59:09,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 21:59:12,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:59:16,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:16,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:16,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 21:59:16,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:16,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:59:18,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 21:59:23,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:59:24,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:28,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:28,431 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=502400.0, ans=0.04949747468305833 2023-09-29 21:59:29,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 21:59:29,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 21:59:32,778 INFO [train.py:1039] (1/4) Epoch 15, batch 1000, loss[loss=0.1699, simple_loss=0.2199, pruned_loss=0.05996, over 19508.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2627, pruned_loss=0.05869, over 4674351.70 frames. ], batch size: 388, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 21:59:34,249 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.874e+02 2.213e+02 2.619e+02 3.676e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 21:59:34,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:37,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 21:59:38,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:59:44,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=502466.6666666667, ans=0.125 2023-09-29 21:59:45,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:59:47,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 21:59:47,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 21:59:53,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:59:53,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:54,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:58,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 22:00:00,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 22:00:01,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 22:00:01,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:04,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 22:00:06,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:00:06,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 22:00:08,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:09,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:11,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=502600.0, ans=0.125 2023-09-29 22:00:17,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:17,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:00:17,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:20,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:20,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 22:00:20,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:20,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:00:21,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:21,947 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 22:00:25,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 22:00:27,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 22:00:30,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 22:00:32,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:00:39,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:39,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:00:39,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:42,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:00:42,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 22:00:44,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:00:45,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 22:00:45,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 22:00:47,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:00:47,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:51,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:00:54,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:00:56,333 INFO [train.py:1039] (1/4) Epoch 15, batch 1050, loss[loss=0.1739, simple_loss=0.2361, pruned_loss=0.05585, over 23408.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2611, pruned_loss=0.05801, over 4681769.47 frames. ], batch size: 285, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:00:56,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:00,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:01:00,918 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.62 vs. limit=12.0 2023-09-29 22:01:01,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:01:03,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:01:04,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:07,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:10,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:01:12,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:01:14,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:01:15,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:01:15,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:01:17,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:01:17,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 22:01:18,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 22:01:19,361 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=502866.6666666667, ans=0.05 2023-09-29 22:01:20,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:01:20,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 22:01:20,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:01:24,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=502866.6666666667, ans=0.04949747468305833 2023-09-29 22:01:27,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:29,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:01:29,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:32,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 22:01:34,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 22:01:34,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:36,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 22:01:38,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 22:01:39,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:44,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:01:47,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:01:47,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:01:47,509 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=503000.0, ans=0.0 2023-09-29 22:01:48,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:01:51,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:01:54,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=22.5 2023-09-29 22:01:54,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 22:01:56,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 22:01:56,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 22:01:56,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:01:56,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:02:00,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 22:02:05,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:02:07,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:02:07,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:08,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:08,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:13,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:13,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 22:02:14,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:14,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 22:02:15,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 22:02:16,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:02:17,889 INFO [train.py:1039] (1/4) Epoch 15, batch 1100, loss[loss=0.2039, simple_loss=0.2669, pruned_loss=0.07045, over 23812.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2604, pruned_loss=0.05739, over 4692277.90 frames. ], batch size: 212, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:02:19,327 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.797e+02 2.092e+02 2.502e+02 4.130e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 22:02:19,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:02:25,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:02:28,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=503133.3333333333, ans=0.0 2023-09-29 22:02:30,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:02:30,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=503133.3333333333, ans=0.95 2023-09-29 22:02:31,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:02:31,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:33,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 22:02:33,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=503200.0, ans=0.125 2023-09-29 22:02:35,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:02:35,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=503200.0, ans=0.125 2023-09-29 22:02:37,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:02:39,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:02:44,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:02:44,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 22:02:47,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:02:47,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:47,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:50,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:02:52,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:02:54,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=503266.6666666667, ans=0.0 2023-09-29 22:02:57,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:00,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 22:03:01,759 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 22:03:01,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:04,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:05,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:03:06,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:03:08,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 22:03:08,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:03:10,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:03:10,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:03:10,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:10,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 22:03:14,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=503333.3333333333, ans=0.1 2023-09-29 22:03:16,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:03:16,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 22:03:20,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:03:24,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:03:26,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 22:03:26,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:03:26,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=503400.0, ans=0.125 2023-09-29 22:03:28,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:32,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:32,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:34,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 22:03:34,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:03:35,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:35,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=503400.0, ans=0.0 2023-09-29 22:03:37,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 22:03:37,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:03:37,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 22:03:38,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:03:38,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:03:40,085 INFO [train.py:1039] (1/4) Epoch 15, batch 1150, loss[loss=0.1629, simple_loss=0.2432, pruned_loss=0.04127, over 24595.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2607, pruned_loss=0.05747, over 4697145.32 frames. ], batch size: 60, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:03:40,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:03:42,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=503466.6666666667, ans=0.1 2023-09-29 22:03:43,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:48,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:03:50,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:50,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:03:51,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 22:03:52,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:56,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 22:03:57,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:57,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:04:05,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 22:04:06,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:09,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:04:10,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:10,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 22:04:10,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:04:10,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:04:13,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 22:04:14,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:16,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:04:29,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 22:04:36,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:36,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:43,209 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 22:04:43,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=503666.6666666667, ans=0.0 2023-09-29 22:04:44,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:51,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=503733.3333333333, ans=0.0 2023-09-29 22:04:51,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=503733.3333333333, ans=0.0 2023-09-29 22:04:52,397 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 22:04:57,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=503733.3333333333, ans=0.125 2023-09-29 22:04:59,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:04:59,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:05:00,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:05:01,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:05:03,942 INFO [train.py:1039] (1/4) Epoch 15, batch 1200, loss[loss=0.1915, simple_loss=0.2603, pruned_loss=0.06141, over 23775.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2613, pruned_loss=0.05733, over 4710283.70 frames. ], batch size: 212, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:05:05,382 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.402e+02 1.796e+02 2.044e+02 2.374e+02 3.909e+02, threshold=4.087e+02, percent-clipped=0.0 2023-09-29 22:05:05,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:10,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=503800.0, ans=0.125 2023-09-29 22:05:11,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:05:11,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:05:14,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:14,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:14,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:05:16,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:05:18,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:05:19,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:21,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:22,757 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 22:05:24,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 22:05:27,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:05:31,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:05:33,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:36,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:05:36,414 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 22:05:38,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:44,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:05:44,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:05:44,963 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=503933.3333333333, ans=0.125 2023-09-29 22:05:46,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 22:05:46,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:05:47,145 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.45 vs. limit=15.0 2023-09-29 22:05:49,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 22:05:55,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 22:05:55,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:57,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:58,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:00,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:06:01,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:06:01,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:06:03,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:06:03,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 22:06:05,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:06:05,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:05,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:06:05,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=504000.0, ans=0.0 2023-09-29 22:06:06,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:06,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:13,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:06:14,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:06:17,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 22:06:22,005 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 22:06:23,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:24,957 INFO [train.py:1039] (1/4) Epoch 15, batch 1250, loss[loss=0.1707, simple_loss=0.2507, pruned_loss=0.04535, over 24670.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2617, pruned_loss=0.05744, over 4711993.54 frames. ], batch size: 65, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:06:25,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=504133.3333333333, ans=0.125 2023-09-29 22:06:26,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:26,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=504133.3333333333, ans=0.125 2023-09-29 22:06:29,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:06:29,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:31,395 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=504133.3333333333, ans=0.0 2023-09-29 22:06:34,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 22:06:37,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:06:37,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=504133.3333333333, ans=0.1 2023-09-29 22:06:39,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:40,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 22:06:41,621 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.18 vs. limit=15.0 2023-09-29 22:06:42,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:06:42,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:06:48,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:06:49,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:51,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:06:51,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:52,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:06:57,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:06:57,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:06:57,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:59,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:59,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:01,118 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:07:02,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:07:04,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=504266.6666666667, ans=0.0 2023-09-29 22:07:08,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 22:07:08,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:07:08,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=504266.6666666667, ans=0.0 2023-09-29 22:07:11,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:11,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 22:07:13,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:07:13,235 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 22:07:13,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:13,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:19,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:22,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:23,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:07:25,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 22:07:25,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 22:07:25,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 22:07:28,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:28,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=504400.0, ans=0.0 2023-09-29 22:07:30,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 22:07:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:33,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:07:33,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:07:34,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 22:07:34,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:07:36,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:07:36,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:07:37,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:37,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 22:07:40,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:41,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.69 vs. limit=15.0 2023-09-29 22:07:42,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:07:42,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:07:44,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=504466.6666666667, ans=0.04949747468305833 2023-09-29 22:07:46,007 INFO [train.py:1039] (1/4) Epoch 15, batch 1300, loss[loss=0.1854, simple_loss=0.2729, pruned_loss=0.04898, over 24654.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2624, pruned_loss=0.05772, over 4707374.27 frames. ], batch size: 73, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:07:46,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:07:48,103 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.942e+02 2.297e+02 2.853e+02 4.160e+02, threshold=4.593e+02, percent-clipped=1.0 2023-09-29 22:07:50,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:50,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 22:07:55,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:58,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:07:59,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:00,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:08:00,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:08:01,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 22:08:07,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:08:09,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:08:10,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 22:08:11,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=504533.3333333333, ans=0.125 2023-09-29 22:08:12,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:08:15,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=504533.3333333333, ans=0.1 2023-09-29 22:08:17,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:19,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:20,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:08:22,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:22,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:08:23,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:08:23,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 22:08:30,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:08:30,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:08:31,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 22:08:32,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:08:33,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:08:38,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:38,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 22:08:38,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:38,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 22:08:39,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:43,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:43,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:08:47,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 22:08:49,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 22:08:51,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 22:08:56,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:08:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 22:09:01,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:02,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=504733.3333333333, ans=0.2 2023-09-29 22:09:07,787 INFO [train.py:1039] (1/4) Epoch 15, batch 1350, loss[loss=0.1825, simple_loss=0.2565, pruned_loss=0.05423, over 23372.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2614, pruned_loss=0.05709, over 4720246.23 frames. ], batch size: 119, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:09:07,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 22:09:09,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:09,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=504800.0, ans=0.1 2023-09-29 22:09:11,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=504800.0, ans=0.0 2023-09-29 22:09:12,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:15,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:17,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:18,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:09:18,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:26,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:26,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 22:09:27,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:09:29,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:09:33,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 22:09:33,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:09:35,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:09:35,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 22:09:36,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 22:09:39,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 22:09:41,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:41,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 22:09:53,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:04,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 22:10:08,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:10,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 22:10:10,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:10:11,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:10:14,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:10:16,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 22:10:17,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:10:21,824 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.39 vs. limit=10.0 2023-09-29 22:10:22,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 22:10:24,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 22:10:29,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 22:10:29,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:31,216 INFO [train.py:1039] (1/4) Epoch 15, batch 1400, loss[loss=0.1794, simple_loss=0.2669, pruned_loss=0.04596, over 24672.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2599, pruned_loss=0.05682, over 4713836.93 frames. ], batch size: 73, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:10:33,170 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.906e+02 2.114e+02 2.329e+02 4.269e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-29 22:10:34,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:10:34,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:10:42,615 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:10:43,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 22:10:45,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 22:10:45,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=505133.3333333333, ans=0.1 2023-09-29 22:10:52,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=505200.0, ans=0.2 2023-09-29 22:10:53,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:10:54,452 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-09-29 22:10:56,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:10:59,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:10:59,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:11:02,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:11:04,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 22:11:14,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:15,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:20,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 22:11:22,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:11:23,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:11:25,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:11:25,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:11:26,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=505333.3333333333, ans=0.0 2023-09-29 22:11:27,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:11:27,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:11:28,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:11:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 22:11:29,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:11:34,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:37,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:11:37,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=505400.0, ans=0.125 2023-09-29 22:11:43,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 22:11:44,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:11:44,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:11:50,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 22:11:51,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:11:53,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:11:54,999 INFO [train.py:1039] (1/4) Epoch 15, batch 1450, loss[loss=0.2112, simple_loss=0.285, pruned_loss=0.06868, over 23314.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2594, pruned_loss=0.05689, over 4697641.72 frames. ], batch size: 93, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:11:55,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=505466.6666666667, ans=0.125 2023-09-29 22:11:56,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.10 vs. limit=15.0 2023-09-29 22:11:56,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:12:01,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:12:01,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:01,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:12:05,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:07,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:12:08,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:12:08,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 22:12:10,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:12:11,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 22:12:12,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:13,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:13,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 22:12:13,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:15,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:12:16,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 22:12:16,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:18,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:12:20,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:24,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:26,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:12:27,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:12:29,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:29,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:30,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:30,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:12:31,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:32,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:36,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 22:12:39,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:40,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=505600.0, ans=0.04949747468305833 2023-09-29 22:12:44,382 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 22:12:45,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:12:46,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:12:47,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:12:47,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=505666.6666666667, ans=0.2 2023-09-29 22:12:49,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 22:12:53,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:56,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 22:12:59,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 22:12:59,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=505733.3333333333, ans=0.07 2023-09-29 22:13:00,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:02,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=505733.3333333333, ans=0.04949747468305833 2023-09-29 22:13:03,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:05,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:06,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 22:13:08,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 22:13:08,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 22:13:09,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:11,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:13:16,247 INFO [train.py:1039] (1/4) Epoch 15, batch 1500, loss[loss=0.1672, simple_loss=0.2498, pruned_loss=0.04225, over 24301.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.26, pruned_loss=0.05664, over 4703313.24 frames. ], batch size: 61, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:13:17,607 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.853e+02 2.114e+02 2.421e+02 4.526e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-29 22:13:21,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 22:13:21,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=505800.0, ans=0.125 2023-09-29 22:13:22,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:13:22,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:13:22,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:24,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:26,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:13:27,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 22:13:29,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:13:29,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:13:29,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:31,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:32,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:13:34,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:39,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 22:13:39,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:13:40,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:13:40,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:43,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 22:13:48,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 22:13:49,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:51,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 22:13:54,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:13:54,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=505933.3333333333, ans=0.125 2023-09-29 22:13:57,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:13:57,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:57,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:58,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 22:13:58,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:13:58,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:00,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 22:14:02,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:06,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:14:06,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 22:14:12,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:14:14,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:14:20,418 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 22:14:20,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:20,506 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 22:14:20,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:22,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:14:23,614 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 22:14:25,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:14:29,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 22:14:30,594 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-29 22:14:31,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:35,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:14:36,451 INFO [train.py:1039] (1/4) Epoch 15, batch 1550, loss[loss=0.1731, simple_loss=0.2534, pruned_loss=0.0464, over 24659.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2611, pruned_loss=0.05739, over 4701276.88 frames. ], batch size: 65, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:14:36,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 22:14:38,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 22:14:38,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:14:39,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 22:14:39,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 22:14:43,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:45,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:46,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:14:46,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:14:48,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:48,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:50,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=506133.3333333333, ans=0.125 2023-09-29 22:14:51,531 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=506200.0, ans=0.2 2023-09-29 22:14:52,814 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 22:14:54,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:54,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:14:54,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:14:57,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:14:57,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 22:14:59,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:59,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 22:15:01,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 22:15:01,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 22:15:02,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:03,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:08,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:15:11,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 22:15:11,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 22:15:19,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:22,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:15:22,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=506266.6666666667, ans=0.125 2023-09-29 22:15:23,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.83 vs. limit=22.5 2023-09-29 22:15:24,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:15:24,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:15:25,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 22:15:30,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:15:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:34,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:15:37,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:15:39,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:39,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 22:15:39,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:15:40,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:15:40,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:42,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:15:42,424 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 22:15:47,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:15:52,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 22:15:52,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=506400.0, ans=0.0 2023-09-29 22:15:57,915 INFO [train.py:1039] (1/4) Epoch 15, batch 1600, loss[loss=0.188, simple_loss=0.2536, pruned_loss=0.06126, over 23455.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2616, pruned_loss=0.05774, over 4700667.27 frames. ], batch size: 134, lr: 6.95e-03, grad_scale: 32.0 2023-09-29 22:15:59,412 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.908e+02 2.211e+02 2.589e+02 3.896e+02, threshold=4.422e+02, percent-clipped=0.0 2023-09-29 22:15:59,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:00,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:01,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 22:16:02,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:16:02,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:02,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:16:02,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:16:03,520 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.65 vs. limit=22.5 2023-09-29 22:16:04,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:16:07,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:09,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 22:16:09,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=506466.6666666667, ans=0.125 2023-09-29 22:16:10,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 22:16:12,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 22:16:15,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:15,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 22:16:16,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:16:19,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:16:23,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:16:28,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 22:16:31,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:16:33,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 22:16:34,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:34,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 22:16:39,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 22:16:39,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=506600.0, ans=0.125 2023-09-29 22:16:45,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:45,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 22:16:46,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=506666.6666666667, ans=0.025 2023-09-29 22:16:50,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:50,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:50,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:16:52,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 22:16:58,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:17:00,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:17:00,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:00,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:01,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:17:05,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:17:06,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:17:07,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:17:10,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=506733.3333333333, ans=0.0 2023-09-29 22:17:11,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=506733.3333333333, ans=0.125 2023-09-29 22:17:13,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=506733.3333333333, ans=0.0 2023-09-29 22:17:14,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:14,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:17:14,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=506733.3333333333, ans=0.0 2023-09-29 22:17:17,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 22:17:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:17:17,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 22:17:23,492 INFO [train.py:1039] (1/4) Epoch 15, batch 1650, loss[loss=0.1985, simple_loss=0.2709, pruned_loss=0.06308, over 23352.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2629, pruned_loss=0.05868, over 4688523.06 frames. ], batch size: 93, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:17:25,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:25,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:17:27,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:17:27,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 22:17:27,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 22:17:27,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 22:17:29,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 22:17:32,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:32,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:34,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:17:34,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:17:37,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:39,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 22:17:41,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=506866.6666666667, ans=0.125 2023-09-29 22:17:44,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:17:44,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:44,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:17:44,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:17:45,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 22:17:45,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 22:17:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:17:55,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:18:01,908 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-09-29 22:18:03,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 22:18:03,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:07,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 22:18:10,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:11,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-09-29 22:18:12,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:18:13,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:18:14,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:14,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:18:14,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:17,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:19,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:19,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:19,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:20,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:21,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:18:23,403 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=12.0 2023-09-29 22:18:24,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:26,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 22:18:27,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:27,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 22:18:28,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 22:18:28,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 22:18:28,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:30,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:18:30,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:31,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:31,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 22:18:35,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:38,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:18:38,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:41,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 22:18:47,119 INFO [train.py:1039] (1/4) Epoch 15, batch 1700, loss[loss=0.1693, simple_loss=0.2244, pruned_loss=0.05708, over 23373.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2618, pruned_loss=0.05818, over 4690964.28 frames. ], batch size: 285, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:18:48,608 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.972e+02 2.169e+02 2.497e+02 4.927e+02, threshold=4.339e+02, percent-clipped=2.0 2023-09-29 22:18:48,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:48,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:18:48,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 22:18:48,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=507133.3333333333, ans=0.0 2023-09-29 22:18:49,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=507133.3333333333, ans=0.125 2023-09-29 22:18:50,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:18:50,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:18:51,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:54,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:18:54,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:18:54,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 22:18:57,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:19:05,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:07,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:19:14,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:19:14,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:14,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:19:14,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:19,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 22:19:21,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:19:21,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:22,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:19:24,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:19:26,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 22:19:26,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 22:19:27,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:29,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 22:19:30,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:19:38,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:38,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:40,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:42,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:19:42,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 22:19:42,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:45,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:45,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 22:19:45,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:19:45,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:47,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:47,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:19:48,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=507333.3333333333, ans=0.0 2023-09-29 22:19:50,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:50,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:19:51,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:51,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:19:51,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:56,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:57,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 22:20:00,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:00,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:04,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 22:20:08,730 INFO [train.py:1039] (1/4) Epoch 15, batch 1750, loss[loss=0.1757, simple_loss=0.2451, pruned_loss=0.05313, over 21144.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2595, pruned_loss=0.05747, over 4689026.15 frames. ], batch size: 46, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:20:08,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:11,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:12,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:20:14,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 22:20:14,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:20:17,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:20:18,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:24,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 22:20:26,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:28,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 22:20:28,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:20:29,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:20:34,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:20:35,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 22:20:37,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:20:37,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 22:20:46,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:20:50,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:20:50,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:53,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:53,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:55,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:57,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:00,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:02,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:02,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 22:21:02,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=507666.6666666667, ans=0.0 2023-09-29 22:21:03,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:06,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 22:21:07,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:10,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:11,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:21:16,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:21:16,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 22:21:16,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:19,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:22,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:25,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:27,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:21:29,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 22:21:29,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:30,898 INFO [train.py:1039] (1/4) Epoch 15, batch 1800, loss[loss=0.1762, simple_loss=0.2535, pruned_loss=0.04946, over 17232.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2589, pruned_loss=0.05705, over 4699049.22 frames. ], batch size: 37, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:21:30,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:21:30,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:30,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:21:31,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:21:31,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:21:34,568 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.864e+02 2.038e+02 2.350e+02 3.855e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-29 22:21:34,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:21:36,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:36,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=507800.0, ans=0.2 2023-09-29 22:21:37,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:21:40,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:42,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:21:45,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:21:48,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:21:51,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:51,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:53,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:21:54,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:54,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 22:21:56,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:59,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:03,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 22:22:06,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=507933.3333333333, ans=0.125 2023-09-29 22:22:07,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 22:22:07,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 22:22:07,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:07,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=507933.3333333333, ans=0.1 2023-09-29 22:22:07,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=507933.3333333333, ans=0.0 2023-09-29 22:22:08,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:22:08,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:11,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:22:17,290 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 22:22:17,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:22:20,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:22,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 22:22:23,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 22:22:23,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:22:25,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:22:27,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:22:33,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 22:22:38,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:22:40,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 22:22:42,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:22:42,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:42,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:22:43,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 22:22:45,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:22:45,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:22:49,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 22:22:49,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:50,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:51,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:22:51,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:51,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:52,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:22:54,103 INFO [train.py:1039] (1/4) Epoch 15, batch 1850, loss[loss=0.2041, simple_loss=0.2764, pruned_loss=0.06591, over 23498.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2594, pruned_loss=0.05718, over 4709229.68 frames. ], batch size: 106, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:22:55,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:55,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:58,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:23:00,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:01,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=508133.3333333333, ans=0.125 2023-09-29 22:23:07,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=508133.3333333333, ans=0.125 2023-09-29 22:23:08,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:23:08,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 22:23:15,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 22:23:18,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 22:23:23,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:23,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 22:23:23,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 22:23:33,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:23:33,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 22:23:36,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:23:37,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:23:38,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=508266.6666666667, ans=0.125 2023-09-29 22:23:41,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 22:23:41,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:41,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:23:43,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:23:45,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:47,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:23:51,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:23:51,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:52,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:23:52,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:53,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:23:55,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:23:57,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=508333.3333333333, ans=0.0 2023-09-29 22:23:58,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 22:24:00,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:24:03,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:24:05,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:24:05,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 22:24:05,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 22:24:06,705 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 22:24:08,208 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 22:24:08,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=508400.0, ans=0.1 2023-09-29 22:24:09,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:24:09,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:24:09,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:09,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:09,943 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 22:24:09,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:24:11,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:11,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:24:11,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=508400.0, ans=0.125 2023-09-29 22:24:15,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:24:17,055 INFO [train.py:1039] (1/4) Epoch 15, batch 1900, loss[loss=0.184, simple_loss=0.2486, pruned_loss=0.0597, over 23678.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2602, pruned_loss=0.05728, over 4710700.23 frames. ], batch size: 164, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:24:17,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:24:17,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 22:24:18,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:18,824 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 22:24:18,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:24:20,831 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.037e+02 2.291e+02 2.918e+02 4.608e+02, threshold=4.583e+02, percent-clipped=3.0 2023-09-29 22:24:20,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:22,061 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.36 vs. limit=15.0 2023-09-29 22:24:25,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:28,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:24:28,870 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:24:29,947 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 22:24:30,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 22:24:33,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:33,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:24:33,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=508533.3333333333, ans=0.05 2023-09-29 22:24:34,975 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 22:24:35,030 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 22:24:39,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 22:24:41,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:24:44,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 22:24:45,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 22:24:49,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=508600.0, ans=0.02 2023-09-29 22:24:54,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=508600.0, ans=10.0 2023-09-29 22:24:59,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 22:25:02,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 22:25:02,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:02,826 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 22:25:02,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 22:25:03,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=508600.0, ans=0.125 2023-09-29 22:25:04,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 22:25:04,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 22:25:04,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:09,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 22:25:12,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:25:16,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:16,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 22:25:18,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:25:20,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=508733.3333333333, ans=0.125 2023-09-29 22:25:21,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 22:25:21,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:27,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=508733.3333333333, ans=0.125 2023-09-29 22:25:28,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:25:28,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:25:28,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:25:30,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:25:32,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:25:32,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:25:33,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:25:36,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:36,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:25:38,206 INFO [train.py:1039] (1/4) Epoch 15, batch 1950, loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.04988, over 24498.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2608, pruned_loss=0.05704, over 4729559.77 frames. ], batch size: 63, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:25:39,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:25:39,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:41,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:41,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:44,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:25:47,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:25:47,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:47,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:25:49,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 22:25:51,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:25:51,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:53,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:53,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=508866.6666666667, ans=0.125 2023-09-29 22:25:53,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=508866.6666666667, ans=0.0 2023-09-29 22:25:54,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:25:56,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:25:56,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:56,462 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=508866.6666666667, ans=0.1 2023-09-29 22:25:58,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:25:58,783 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=15.0 2023-09-29 22:26:01,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:26:01,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:26:01,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:26:01,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:06,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:08,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:26:08,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:08,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:26:08,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 22:26:09,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:26:09,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:26:10,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:13,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:16,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:26:22,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:26:25,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:26:25,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:26:25,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 22:26:25,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:26:32,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:26:33,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:26:35,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:26:42,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:44,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:46,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:49,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:51,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:26:53,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:53,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 22:26:53,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:26:53,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:56,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 22:26:58,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:26:59,709 INFO [train.py:1039] (1/4) Epoch 15, batch 2000, loss[loss=0.1897, simple_loss=0.2731, pruned_loss=0.0531, over 24422.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.261, pruned_loss=0.05712, over 4721996.71 frames. ], batch size: 69, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:27:02,736 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.867e+02 2.115e+02 2.554e+02 3.825e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 22:27:02,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:27:05,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:27:05,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:06,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:27:09,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:13,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 22:27:13,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:27:15,419 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:27:18,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:27:21,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 22:27:21,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:27:21,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:27:24,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:27:25,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 22:27:27,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:27,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:28,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:29,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 22:27:29,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:27:31,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 22:27:31,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:35,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:27:37,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:27:37,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:39,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:39,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:27:40,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 22:27:44,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 22:27:44,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:44,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:27:50,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:51,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:27:51,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:53,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:54,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:54,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:56,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:56,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:57,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:00,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:28:02,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 22:28:07,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:28:09,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:28:13,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=509400.0, ans=0.0 2023-09-29 22:28:16,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:17,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:17,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:19,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:28:19,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:28:20,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=509400.0, ans=0.07 2023-09-29 22:28:22,446 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=15.0 2023-09-29 22:28:23,157 INFO [train.py:1039] (1/4) Epoch 15, batch 2050, loss[loss=0.1689, simple_loss=0.2244, pruned_loss=0.05676, over 23402.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2604, pruned_loss=0.05692, over 4720952.83 frames. ], batch size: 285, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:28:23,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:24,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:27,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:28,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:31,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:28:34,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:28:34,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:34,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:28:37,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 22:28:37,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:28:38,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:28:38,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:28:44,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=509533.3333333333, ans=0.125 2023-09-29 22:28:46,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=509533.3333333333, ans=0.2 2023-09-29 22:28:50,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:28:50,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:54,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 22:28:55,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:57,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 22:28:57,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:29:02,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:04,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:06,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:29:06,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:08,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:29:09,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:29:11,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:29:14,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:16,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:29:17,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:29:21,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:29:23,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:30,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=509733.3333333333, ans=0.0 2023-09-29 22:29:31,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:29:32,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 22:29:36,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:37,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:29:40,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:29:42,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 22:29:43,641 INFO [train.py:1039] (1/4) Epoch 15, batch 2100, loss[loss=0.1767, simple_loss=0.2467, pruned_loss=0.05339, over 23528.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2586, pruned_loss=0.0565, over 4717202.16 frames. ], batch size: 134, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:29:45,610 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 22:29:45,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:45,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:46,884 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.817e+02 2.090e+02 2.571e+02 3.864e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 22:29:47,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:29:48,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:48,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 22:29:48,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 22:29:50,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:55,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:29:56,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:29:57,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:57,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=509800.0, ans=0.125 2023-09-29 22:29:59,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:29:59,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 22:30:01,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:30:01,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 22:30:01,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 22:30:04,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:04,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:04,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 22:30:04,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 22:30:09,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 22:30:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:30:14,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:30:14,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:30:18,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:30:18,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 22:30:20,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:20,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:30:21,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 22:30:21,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:21,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 22:30:21,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 22:30:23,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 22:30:26,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:30:30,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:30:32,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:32,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.83 vs. limit=10.0 2023-09-29 22:30:33,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:35,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 22:30:37,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:37,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 22:30:40,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 22:30:40,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 22:30:43,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:30:43,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=510000.0, ans=0.1 2023-09-29 22:30:46,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:46,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 22:30:51,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:54,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:30:56,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:30:56,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:30:56,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:30:56,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:30:56,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=510066.6666666667, ans=0.125 2023-09-29 22:30:57,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:57,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:30:59,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:30:59,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:03,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 22:31:07,264 INFO [train.py:1039] (1/4) Epoch 15, batch 2150, loss[loss=0.1994, simple_loss=0.2447, pruned_loss=0.07706, over 18945.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2579, pruned_loss=0.05618, over 4701769.43 frames. ], batch size: 388, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:31:07,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 22:31:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:09,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:31:09,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:31:09,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:31:09,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:31:15,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:31:15,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=510133.3333333333, ans=0.04949747468305833 2023-09-29 22:31:18,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:18,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:20,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:31:20,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:20,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:31:23,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:24,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:31:24,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:31:27,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:27,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 22:31:28,091 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=510200.0, ans=0.1 2023-09-29 22:31:29,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.19 vs. limit=15.0 2023-09-29 22:31:31,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.80 vs. limit=15.0 2023-09-29 22:31:32,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:34,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:31:36,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:36,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:37,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:38,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:31:38,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:38,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:31:40,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:31:42,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 22:31:43,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:31:43,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:45,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:45,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:31:46,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:31:48,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:50,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:31:51,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:51,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 22:31:51,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:31:54,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:56,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:57,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:59,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:31:59,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:00,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:00,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 22:32:02,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 22:32:02,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:32:03,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 22:32:03,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:32:07,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 22:32:07,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:32:07,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 22:32:07,344 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 22:32:07,345 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 22:32:07,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 22:32:11,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:11,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:32:13,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:32:13,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:14,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:32:14,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:14,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:19,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=510400.0, ans=0.0 2023-09-29 22:32:22,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:32:22,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=510400.0, ans=0.125 2023-09-29 22:32:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 22:32:25,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:32:27,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=510466.6666666667, ans=0.0 2023-09-29 22:32:28,649 INFO [train.py:1039] (1/4) Epoch 15, batch 2200, loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04006, over 24597.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2581, pruned_loss=0.05622, over 4706984.19 frames. ], batch size: 60, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:32:31,712 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.917e+02 2.133e+02 2.422e+02 4.121e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 22:32:31,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:31,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:32:32,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:32:32,236 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:32:33,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:32:35,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:35,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:32:35,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 22:32:42,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 22:32:44,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:32:49,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 22:32:52,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:53,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:32:54,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:32:56,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=510533.3333333333, ans=0.07 2023-09-29 22:33:00,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:33:00,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 22:33:04,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:33:06,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:06,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 22:33:10,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:33:11,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:14,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:33:14,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:16,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 22:33:19,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:20,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 22:33:22,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:23,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:33:23,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:26,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:33:28,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:28,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:28,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:28,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=510666.6666666667, ans=0.0 2023-09-29 22:33:29,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:33:30,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:33:32,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:33:33,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=510733.3333333333, ans=0.0 2023-09-29 22:33:36,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:33:37,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:33:39,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:33:40,666 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 22:33:42,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:33:42,421 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 22:33:43,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:33:45,355 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 22:33:47,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:47,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:33:50,560 INFO [train.py:1039] (1/4) Epoch 15, batch 2250, loss[loss=0.1854, simple_loss=0.2707, pruned_loss=0.05011, over 24340.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2591, pruned_loss=0.05644, over 4716882.38 frames. ], batch size: 74, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:33:50,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:51,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=510800.0, ans=0.0 2023-09-29 22:33:52,582 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 22:33:54,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:33:56,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=510800.0, ans=0.0 2023-09-29 22:33:56,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:34:01,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:34:03,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:34:07,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:07,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:09,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:34:10,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 22:34:12,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:12,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:34:13,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 22:34:15,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:34:15,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:15,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=510866.6666666667, ans=0.2 2023-09-29 22:34:17,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:20,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:23,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:25,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:34:25,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:34:25,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=510933.3333333333, ans=0.125 2023-09-29 22:34:26,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 22:34:26,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:30,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:34:31,468 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.64 vs. limit=12.0 2023-09-29 22:34:34,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:35,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:38,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:34:38,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:41,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:43,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:34:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:34:50,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:34:56,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:34:56,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:34:58,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:35:04,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:35:08,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:35:08,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 22:35:08,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:08,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:35:11,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 22:35:12,804 INFO [train.py:1039] (1/4) Epoch 15, batch 2300, loss[loss=0.2128, simple_loss=0.278, pruned_loss=0.07378, over 23543.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2601, pruned_loss=0.057, over 4711170.05 frames. ], batch size: 256, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:35:14,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:35:15,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:19,133 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.881e+02 2.170e+02 2.531e+02 3.802e+02, threshold=4.341e+02, percent-clipped=0.0 2023-09-29 22:35:20,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:22,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:35:23,849 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 22:35:25,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:32,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:35:32,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:35:32,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:35:32,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:33,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 22:35:35,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:35:40,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:35:40,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:35:45,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:35:48,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:35:51,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:35:57,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:35:58,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:36:00,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:36:03,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:08,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:36:08,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:36:08,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:36:08,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 22:36:12,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:36:12,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:13,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:13,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:36:13,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:15,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:36:15,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:36:16,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 22:36:16,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:36:16,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:16,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 22:36:21,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:36:21,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=511400.0, ans=0.1 2023-09-29 22:36:26,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:36:30,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:30,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:36:30,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:36:32,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:36:32,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:36:32,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:36:33,774 INFO [train.py:1039] (1/4) Epoch 15, batch 2350, loss[loss=0.188, simple_loss=0.2436, pruned_loss=0.06622, over 23883.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2612, pruned_loss=0.05731, over 4721349.58 frames. ], batch size: 164, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:36:33,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 22:36:41,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:36:42,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 22:36:48,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 22:36:50,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:55,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:36:55,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:56,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 22:37:00,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=511533.3333333333, ans=0.1 2023-09-29 22:37:00,385 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=511533.3333333333, ans=0.0 2023-09-29 22:37:01,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:37:01,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=511533.3333333333, ans=0.2 2023-09-29 22:37:07,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 22:37:07,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=511600.0, ans=0.1 2023-09-29 22:37:09,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:37:12,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:37:12,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:37:15,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:37:15,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=511600.0, ans=0.0 2023-09-29 22:37:17,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 22:37:17,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:37:19,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:37:19,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:19,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:37:24,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:37:25,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=511666.6666666667, ans=0.1 2023-09-29 22:37:28,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 22:37:28,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:37:30,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:37:31,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:37:32,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=511666.6666666667, ans=0.125 2023-09-29 22:37:33,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 22:37:33,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:37:36,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 22:37:36,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:37:39,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 22:37:44,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 22:37:44,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:44,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:37:44,371 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 22:37:45,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 22:37:48,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 22:37:52,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:37:52,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=511733.3333333333, ans=0.04949747468305833 2023-09-29 22:37:56,456 INFO [train.py:1039] (1/4) Epoch 15, batch 2400, loss[loss=0.202, simple_loss=0.2606, pruned_loss=0.0717, over 23733.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2606, pruned_loss=0.05741, over 4712414.30 frames. ], batch size: 179, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:37:58,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:37:59,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=511800.0, ans=0.125 2023-09-29 22:38:01,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:38:01,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=511800.0, ans=0.0 2023-09-29 22:38:03,208 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.881e+02 2.096e+02 2.397e+02 4.111e+02, threshold=4.192e+02, percent-clipped=0.0 2023-09-29 22:38:03,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:38:03,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 22:38:03,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 22:38:10,306 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.06 vs. limit=22.5 2023-09-29 22:38:12,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:38:12,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:15,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 22:38:15,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:38:17,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:17,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 22:38:23,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:25,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 22:38:30,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:38:37,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 22:38:38,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:38:40,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:43,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:38:43,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 22:38:43,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:38:51,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:53,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:38:55,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=512000.0, ans=0.07 2023-09-29 22:38:56,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:58,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:38:58,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:38:58,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:38:58,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:58,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:00,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:39:05,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:07,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:39:07,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 22:39:07,294 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:39:08,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 22:39:10,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:39:10,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:39:10,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 22:39:11,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 22:39:11,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 22:39:11,988 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 22:39:13,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 22:39:14,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:39:16,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:16,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:17,954 INFO [train.py:1039] (1/4) Epoch 15, batch 2450, loss[loss=0.1621, simple_loss=0.2072, pruned_loss=0.05849, over 19220.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2587, pruned_loss=0.05676, over 4700727.81 frames. ], batch size: 388, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:39:18,126 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 22:39:18,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:19,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:39:20,114 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=512133.3333333333, ans=0.1 2023-09-29 22:39:22,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:39:22,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:27,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:27,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:29,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 22:39:34,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:39:34,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:37,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:39:37,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:39:37,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:39:39,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 22:39:43,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:45,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=512200.0, ans=0.125 2023-09-29 22:39:46,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:39:47,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:51,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=512266.6666666667, ans=0.1 2023-09-29 22:39:52,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:39:52,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:54,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:55,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:57,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 22:39:57,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:39:57,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=512266.6666666667, ans=0.0 2023-09-29 22:40:07,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:40:09,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:09,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:40:11,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:12,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:40:12,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 22:40:14,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:40:16,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:40:16,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=512333.3333333333, ans=0.125 2023-09-29 22:40:20,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:40:20,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:24,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:40:24,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 22:40:26,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:40:26,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:40:27,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 22:40:29,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:40:29,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:40:29,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=512400.0, ans=0.0 2023-09-29 22:40:32,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:40:34,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:35,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.57 vs. limit=12.0 2023-09-29 22:40:35,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:40:40,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 22:40:41,409 INFO [train.py:1039] (1/4) Epoch 15, batch 2500, loss[loss=0.1882, simple_loss=0.2534, pruned_loss=0.06148, over 23791.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2582, pruned_loss=0.05649, over 4703723.36 frames. ], batch size: 164, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:40:41,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:40:48,500 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.863e+02 2.026e+02 2.249e+02 3.310e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-29 22:40:48,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:40:58,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:40:58,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:41:00,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:41:00,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 22:41:05,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=512533.3333333333, ans=0.1 2023-09-29 22:41:07,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:41:08,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:08,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:41:08,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 22:41:10,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 22:41:11,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:11,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:11,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 22:41:13,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:13,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 22:41:13,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=512600.0, ans=0.125 2023-09-29 22:41:14,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:15,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=512600.0, ans=0.125 2023-09-29 22:41:18,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=512600.0, ans=10.0 2023-09-29 22:41:20,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:41:20,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:22,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:41:24,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 22:41:25,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:41:27,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:31,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:36,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:39,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:44,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:41:46,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 22:41:48,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:48,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:41:50,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:41:50,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:41:50,302 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 22:41:50,302 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 22:41:50,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 22:41:54,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:56,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 22:41:56,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 22:41:57,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:59,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 22:42:03,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 22:42:04,927 INFO [train.py:1039] (1/4) Epoch 15, batch 2550, loss[loss=0.1603, simple_loss=0.2468, pruned_loss=0.03693, over 24324.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2595, pruned_loss=0.05645, over 4710592.02 frames. ], batch size: 61, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:42:07,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:10,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:42:10,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:42:13,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:15,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 22:42:15,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:42:19,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 22:42:21,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:42:24,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:42:26,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 22:42:28,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:42:28,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:29,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:42:32,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:42:32,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 22:42:32,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:42:32,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:32,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 22:42:35,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=512866.6666666667, ans=0.1 2023-09-29 22:42:45,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:42:51,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:42:51,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:51,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:51,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=512933.3333333333, ans=0.2 2023-09-29 22:42:52,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:42:54,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=513000.0, ans=0.025 2023-09-29 22:42:58,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:42:58,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=513000.0, ans=0.125 2023-09-29 22:43:01,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:43:02,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:43:03,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:43:03,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:43:03,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:43:07,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:07,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:11,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:43:11,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 22:43:11,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:43:12,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:13,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:43:15,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:43:18,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:23,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:43:24,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:27,863 INFO [train.py:1039] (1/4) Epoch 15, batch 2600, loss[loss=0.1878, simple_loss=0.2742, pruned_loss=0.05072, over 24470.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.261, pruned_loss=0.05704, over 4718415.32 frames. ], batch size: 69, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:43:28,099 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 22:43:28,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=513133.3333333333, ans=0.2 2023-09-29 22:43:32,890 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 22:43:32,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:43:34,981 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.927e+02 2.129e+02 2.377e+02 3.619e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-29 22:43:35,103 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 22:43:35,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 22:43:35,282 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 22:43:39,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:39,751 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 22:43:39,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 22:43:42,039 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 22:43:45,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:43:45,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 22:43:48,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 22:43:49,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:43:49,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 22:43:50,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=513200.0, ans=0.125 2023-09-29 22:43:51,328 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 22:43:51,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 22:43:51,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=513200.0, ans=0.1 2023-09-29 22:43:53,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=513200.0, ans=0.2 2023-09-29 22:44:01,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:01,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:01,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:01,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 22:44:02,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=513266.6666666667, ans=0.04949747468305833 2023-09-29 22:44:04,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:44:08,153 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.18 vs. limit=10.0 2023-09-29 22:44:11,276 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 22:44:11,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=513266.6666666667, ans=0.125 2023-09-29 22:44:14,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=513266.6666666667, ans=0.1 2023-09-29 22:44:18,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:18,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:19,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 22:44:19,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:19,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:21,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 22:44:24,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:44:24,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:44:25,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=513333.3333333333, ans=0.0 2023-09-29 22:44:26,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,847 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 22:44:29,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:44:36,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:36,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:44:38,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 22:44:38,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:39,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:44:41,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:44:46,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 22:44:46,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=513400.0, ans=0.125 2023-09-29 22:44:48,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:50,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:44:51,976 INFO [train.py:1039] (1/4) Epoch 15, batch 2650, loss[loss=0.2349, simple_loss=0.293, pruned_loss=0.08839, over 23457.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2625, pruned_loss=0.05785, over 4712703.48 frames. ], batch size: 285, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:44:55,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 22:44:55,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:56,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:44:58,149 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 22:44:58,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:44:59,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=513466.6666666667, ans=0.125 2023-09-29 22:45:01,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:03,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:45:06,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:45:06,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:45:06,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=513533.3333333333, ans=0.05 2023-09-29 22:45:07,200 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.63 vs. limit=15.0 2023-09-29 22:45:08,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 22:45:08,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:45:08,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:45:11,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 22:45:13,353 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 22:45:14,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=513533.3333333333, ans=0.125 2023-09-29 22:45:15,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=513533.3333333333, ans=0.2 2023-09-29 22:45:16,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:18,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 22:45:19,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:20,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 22:45:23,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:23,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:45:25,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:25,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:30,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 22:45:31,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 22:45:33,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:45:33,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.82 vs. limit=22.5 2023-09-29 22:45:38,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 22:45:38,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:39,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:39,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:45:40,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:41,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:43,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:44,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:44,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:46,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:45:48,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:45:48,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=513666.6666666667, ans=0.5 2023-09-29 22:45:49,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:49,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:45:51,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:52,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:52,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:45:55,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=513666.6666666667, ans=0.1 2023-09-29 22:45:57,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:58,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:45:58,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:58,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 22:46:03,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:06,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:09,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:46:11,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:13,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:14,396 INFO [train.py:1039] (1/4) Epoch 15, batch 2700, loss[loss=0.1966, simple_loss=0.2787, pruned_loss=0.05729, over 24573.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2628, pruned_loss=0.05748, over 4726578.14 frames. ], batch size: 71, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:46:14,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 22:46:16,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:46:17,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 22:46:19,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:46:19,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:19,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:21,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.958e+02 2.156e+02 2.389e+02 4.797e+02, threshold=4.312e+02, percent-clipped=1.0 2023-09-29 22:46:21,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:46:21,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:46:22,239 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.27 vs. limit=15.0 2023-09-29 22:46:22,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:46:23,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:46:23,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 22:46:24,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:46:25,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:46:27,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:46:28,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:34,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:46:34,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 22:46:34,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:46:35,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=513866.6666666667, ans=0.2 2023-09-29 22:46:40,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:46:40,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:46:41,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=513866.6666666667, ans=0.2 2023-09-29 22:46:47,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:46:47,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:49,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:46:49,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:46:50,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:46:53,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:53,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:46:53,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:46:59,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:59,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:47:00,099 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.10 vs. limit=15.0 2023-09-29 22:47:08,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:47:08,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:47:12,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:47:12,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:16,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=514000.0, ans=0.125 2023-09-29 22:47:16,570 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.15 vs. limit=22.5 2023-09-29 22:47:17,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:17,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:19,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:47:20,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:22,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:22,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:47:23,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.36 vs. limit=15.0 2023-09-29 22:47:24,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=514066.6666666667, ans=0.125 2023-09-29 22:47:25,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:47:28,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:28,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:31,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 22:47:33,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:36,515 INFO [train.py:1039] (1/4) Epoch 15, batch 2750, loss[loss=0.185, simple_loss=0.2633, pruned_loss=0.0533, over 24014.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2625, pruned_loss=0.0571, over 4736066.18 frames. ], batch size: 86, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:47:36,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:47:36,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 22:47:37,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=514133.3333333333, ans=0.125 2023-09-29 22:47:38,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 22:47:40,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:42,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:47:42,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:45,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:45,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:47:47,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:50,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:47:50,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:47:51,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:47:51,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:51,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 22:47:51,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:47:53,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:58,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 22:48:00,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:48:01,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:01,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:01,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:48:03,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:48:03,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:48:03,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:04,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:09,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:48:10,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=514266.6666666667, ans=0.0 2023-09-29 22:48:11,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:48:11,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:48:12,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:14,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:48:20,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:23,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:48:23,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:25,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=514333.3333333333, ans=0.125 2023-09-29 22:48:26,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:26,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:48:28,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:48:35,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:48:36,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:36,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 22:48:37,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=514333.3333333333, ans=0.125 2023-09-29 22:48:39,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:42,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 22:48:45,165 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-09-29 22:48:50,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:48:51,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:48:51,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 22:48:53,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:48:56,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:48:56,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 22:48:56,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:48:59,414 INFO [train.py:1039] (1/4) Epoch 15, batch 2800, loss[loss=0.2192, simple_loss=0.2944, pruned_loss=0.072, over 23972.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2604, pruned_loss=0.05714, over 4713105.85 frames. ], batch size: 86, lr: 6.89e-03, grad_scale: 32.0 2023-09-29 22:48:59,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:48:59,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:02,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 22:49:02,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:02,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:05,773 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.774e+02 1.954e+02 2.291e+02 3.351e+02, threshold=3.907e+02, percent-clipped=0.0 2023-09-29 22:49:05,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:07,379 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 22:49:07,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 22:49:09,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:10,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:49:10,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:49:15,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:49:17,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 22:49:19,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:49:20,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 22:49:22,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:22,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:49:22,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:27,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:28,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:28,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:49:28,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:49:34,959 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.67 vs. limit=15.0 2023-09-29 22:49:39,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:49:39,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:42,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:42,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:49:44,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:48,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:49:48,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 22:49:50,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:51,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:51,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:49:54,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:56,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:00,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:50:01,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:50:01,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:01,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:50:01,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:50:03,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:50:04,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:50:04,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 22:50:04,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:05,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:50:05,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:08,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 22:50:10,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:10,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:50:10,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:50:11,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=514733.3333333333, ans=0.125 2023-09-29 22:50:13,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 22:50:19,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:50:19,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:50:21,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:50:22,715 INFO [train.py:1039] (1/4) Epoch 15, batch 2850, loss[loss=0.1923, simple_loss=0.2687, pruned_loss=0.05797, over 24458.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2596, pruned_loss=0.05709, over 4708733.42 frames. ], batch size: 63, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:50:24,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:27,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:50:27,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:50:29,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:50:31,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:33,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:35,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:50:36,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 22:50:38,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=514866.6666666667, ans=0.0 2023-09-29 22:50:41,388 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=514866.6666666667, ans=0.2 2023-09-29 22:50:43,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 22:50:43,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:45,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 22:50:45,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:49,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 22:50:49,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 22:50:50,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:03,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:06,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:06,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:51:07,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:51:07,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:51:09,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:51:10,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:51:10,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 22:51:13,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:51:13,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:14,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:15,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:15,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:17,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:18,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:20,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:23,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:51:24,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:24,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:27,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:51:33,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:51:35,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 22:51:35,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 22:51:39,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:51:39,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:39,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 22:51:39,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:51:40,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:40,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:40,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:51:40,821 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 22:51:42,935 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 22:51:42,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:51:43,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:46,096 INFO [train.py:1039] (1/4) Epoch 15, batch 2900, loss[loss=0.1941, simple_loss=0.2747, pruned_loss=0.05671, over 24422.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2598, pruned_loss=0.0568, over 4723728.76 frames. ], batch size: 77, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:51:49,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:51:49,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:49,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:50,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 22:51:53,873 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.822e+02 2.046e+02 2.406e+02 3.211e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-29 22:51:54,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:54,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 22:51:55,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 22:51:55,843 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=515133.3333333333, ans=0.125 2023-09-29 22:51:56,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:51:56,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:00,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:02,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:52:03,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:52:05,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:52:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:52:10,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 22:52:10,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:52:10,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=515200.0, ans=0.125 2023-09-29 22:52:12,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:15,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 22:52:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 22:52:20,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:52:20,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 22:52:20,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:52:23,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:52:23,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:52:26,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:26,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:31,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:52:32,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:52:33,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 22:52:33,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 22:52:33,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:52:38,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:52:40,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 22:52:41,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:52:42,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=515333.3333333333, ans=0.0 2023-09-29 22:52:47,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:56,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:52:56,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:58,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 22:53:01,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:01,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 22:53:02,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:02,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:53:06,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=515466.6666666667, ans=0.05 2023-09-29 22:53:07,967 INFO [train.py:1039] (1/4) Epoch 15, batch 2950, loss[loss=0.1731, simple_loss=0.2549, pruned_loss=0.04568, over 24586.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2604, pruned_loss=0.05721, over 4724970.80 frames. ], batch size: 60, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:53:09,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:11,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 22:53:13,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:13,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:14,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:53:17,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:53:17,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 22:53:17,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 22:53:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:53:19,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:26,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:27,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:29,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:53:30,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:34,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:53:34,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:53:35,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:53:40,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 22:53:44,978 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.05 vs. limit=15.0 2023-09-29 22:53:45,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 22:53:45,685 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 22:53:45,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:53:47,842 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 22:53:48,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.31 vs. limit=6.0 2023-09-29 22:53:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 22:53:49,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:50,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:50,893 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 22:53:50,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:53:54,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 22:53:56,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:58,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:53:59,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:01,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:54:02,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:02,965 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 22:54:04,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:04,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 22:54:10,212 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.36 vs. limit=15.0 2023-09-29 22:54:10,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:10,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:12,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 22:54:12,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:54:14,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 22:54:17,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:20,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:54:20,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:54:22,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:22,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:54:22,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=515733.3333333333, ans=0.125 2023-09-29 22:54:23,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:54:25,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:25,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:54:25,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:54:26,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:26,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=515733.3333333333, ans=0.125 2023-09-29 22:54:27,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:54:29,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:29,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 22:54:31,024 INFO [train.py:1039] (1/4) Epoch 15, batch 3000, loss[loss=0.1639, simple_loss=0.2403, pruned_loss=0.0438, over 24443.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2612, pruned_loss=0.05749, over 4721299.67 frames. ], batch size: 58, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:54:31,024 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 22:54:45,824 INFO [train.py:1071] (1/4) Epoch 15, validation: loss=0.2711, simple_loss=0.2767, pruned_loss=0.1327, over 1125622.00 frames. 2023-09-29 22:54:45,824 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 22:54:46,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:51,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:54:51,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:54:53,998 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.966e+02 2.278e+02 2.682e+02 4.156e+02, threshold=4.556e+02, percent-clipped=1.0 2023-09-29 22:54:54,251 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 22:54:54,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 22:54:57,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:57,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:54:57,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 22:54:57,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:54:57,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=515800.0, ans=0.0 2023-09-29 22:55:06,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:55:16,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:55:18,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=515933.3333333333, ans=0.125 2023-09-29 22:55:21,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 22:55:23,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:55:23,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=515933.3333333333, ans=0.125 2023-09-29 22:55:25,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:55:27,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:55:27,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:55:28,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:28,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 22:55:33,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 22:55:33,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:55:35,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:55:37,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:55:37,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:39,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:39,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:55:42,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:55:42,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:42,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:55:45,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:46,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 22:55:48,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:55:48,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:55:48,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:55:51,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:53,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:54,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:55:54,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 22:55:54,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:55:54,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 22:55:56,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:55:58,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 22:56:00,234 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=516066.6666666667, ans=0.125 2023-09-29 22:56:02,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:02,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 22:56:02,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 22:56:05,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 22:56:05,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:56:07,789 INFO [train.py:1039] (1/4) Epoch 15, batch 3050, loss[loss=0.2033, simple_loss=0.2629, pruned_loss=0.07182, over 23825.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2622, pruned_loss=0.0579, over 4726776.10 frames. ], batch size: 179, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:56:07,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:56:10,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:56:10,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:56:10,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:11,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:56:13,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 22:56:15,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:18,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:18,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:56:21,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:24,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 22:56:29,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 22:56:29,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 22:56:31,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:36,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:56:37,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:37,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:39,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:44,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:56:45,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:46,761 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.93 vs. limit=10.0 2023-09-29 22:56:47,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:47,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:47,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:47,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:47,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=516266.6666666667, ans=0.0 2023-09-29 22:56:50,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:52,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:52,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=516266.6666666667, ans=0.0 2023-09-29 22:56:54,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 22:56:55,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:55,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:56:57,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:58,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:56:58,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:00,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:03,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=516333.3333333333, ans=0.125 2023-09-29 22:57:06,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:57:06,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:13,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:14,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:57:14,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:57:16,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:16,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:57:16,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:57:18,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 22:57:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:19,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:20,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 22:57:23,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:29,750 INFO [train.py:1039] (1/4) Epoch 15, batch 3100, loss[loss=0.1868, simple_loss=0.2751, pruned_loss=0.04928, over 24312.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2623, pruned_loss=0.05817, over 4735405.49 frames. ], batch size: 74, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:57:29,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:31,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:57:33,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:57:34,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 22:57:35,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=516466.6666666667, ans=0.0 2023-09-29 22:57:37,782 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.874e+02 2.072e+02 2.284e+02 2.890e+02, threshold=4.143e+02, percent-clipped=0.0 2023-09-29 22:57:37,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 22:57:40,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 22:57:40,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:57:41,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=516466.6666666667, ans=0.0 2023-09-29 22:57:44,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:46,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:46,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=516533.3333333333, ans=0.0 2023-09-29 22:57:47,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:57:54,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:58,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 22:58:05,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:58:05,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:05,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:07,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:08,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:58:10,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:58:10,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 22:58:10,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:58:10,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:11,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 22:58:12,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=516600.0, ans=0.95 2023-09-29 22:58:13,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:58:16,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:58:16,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=516666.6666666667, ans=0.1 2023-09-29 22:58:17,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 22:58:20,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 22:58:21,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:21,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:23,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:23,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:23,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:58:26,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:58:26,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:58:29,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:58:29,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:58:29,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:29,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 22:58:34,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:35,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 22:58:37,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:58:38,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 22:58:40,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:40,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:41,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 22:58:50,876 INFO [train.py:1039] (1/4) Epoch 15, batch 3150, loss[loss=0.1947, simple_loss=0.2608, pruned_loss=0.06431, over 23783.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2602, pruned_loss=0.05739, over 4717950.92 frames. ], batch size: 179, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:58:51,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 22:58:52,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:58:55,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:56,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:56,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:58:58,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 22:59:00,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:00,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:59:00,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 22:59:01,326 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.80 vs. limit=22.5 2023-09-29 22:59:03,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:05,332 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 22:59:10,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 22:59:10,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:11,740 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 22:59:11,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:59:12,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=516866.6666666667, ans=0.0 2023-09-29 22:59:13,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 22:59:14,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 22:59:14,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 22:59:14,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:14,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:16,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:19,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 22:59:20,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:23,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:59:25,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=516933.3333333333, ans=0.0 2023-09-29 22:59:27,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 22:59:27,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:59:29,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:59:31,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:31,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 22:59:34,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 22:59:34,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:59:36,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:59:36,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:59:36,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:36,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:59:38,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:59:38,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:59:38,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=516933.3333333333, ans=0.1 2023-09-29 22:59:40,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 22:59:40,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:59:40,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:43,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:59:43,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:44,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 22:59:44,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:46,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 22:59:46,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:47,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 22:59:49,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 22:59:49,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:59:49,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=517000.0, ans=0.2 2023-09-29 22:59:50,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.04 vs. limit=12.0 2023-09-29 22:59:51,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:51,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 22:59:52,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:59:52,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:57,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:57,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=517066.6666666667, ans=0.125 2023-09-29 22:59:58,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:58,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:00:00,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=517066.6666666667, ans=0.1 2023-09-29 23:00:05,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:00:06,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:09,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 23:00:12,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:00:12,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 23:00:14,363 INFO [train.py:1039] (1/4) Epoch 15, batch 3200, loss[loss=0.1807, simple_loss=0.2566, pruned_loss=0.05241, over 23624.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2592, pruned_loss=0.0564, over 4718902.64 frames. ], batch size: 149, lr: 6.87e-03, grad_scale: 32.0 2023-09-29 23:00:16,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:17,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:00:17,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 23:00:20,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:00:22,256 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.845e+02 1.990e+02 2.356e+02 4.554e+02, threshold=3.981e+02, percent-clipped=2.0 2023-09-29 23:00:24,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:00:28,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:37,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:00:49,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 23:00:52,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:00:54,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.33 vs. limit=22.5 2023-09-29 23:00:55,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 23:00:55,553 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=517266.6666666667, ans=0.1 2023-09-29 23:00:56,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:01:01,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:01:01,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:01:03,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:01:04,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 23:01:07,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 23:01:08,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 23:01:09,328 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.10 vs. limit=12.0 2023-09-29 23:01:12,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 23:01:15,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:01:19,686 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=11.46 vs. limit=10.0 2023-09-29 23:01:22,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:22,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:01:22,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:23,845 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 23:01:23,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:01:27,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:27,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 23:01:28,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 23:01:28,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 23:01:30,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 23:01:32,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:01:33,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:01:35,049 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 23:01:35,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:01:35,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:01:36,523 INFO [train.py:1039] (1/4) Epoch 15, batch 3250, loss[loss=0.2061, simple_loss=0.2722, pruned_loss=0.07001, over 23811.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2591, pruned_loss=0.05692, over 4700765.39 frames. ], batch size: 179, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:01:36,676 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 23:01:39,025 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.50 vs. limit=22.5 2023-09-29 23:01:43,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:01:44,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=517466.6666666667, ans=0.0 2023-09-29 23:01:45,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:01:54,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:01:54,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 23:01:54,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:55,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:55,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:01:57,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:01:57,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:02:00,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:00,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:02:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:01,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:03,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:06,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:02:07,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=517533.3333333333, ans=0.125 2023-09-29 23:02:09,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:09,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:11,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:11,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:02:11,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:16,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 23:02:18,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:02:18,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:02:20,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:20,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:02:28,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:02:30,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=517666.6666666667, ans=0.125 2023-09-29 23:02:38,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:38,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:38,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 23:02:38,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:02:38,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:02:39,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:41,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 23:02:42,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 23:02:42,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:44,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:44,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:02:45,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:49,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:49,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:02:51,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 23:02:51,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:02:53,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:02:53,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 23:02:55,447 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.36 vs. limit=15.0 2023-09-29 23:02:58,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:58,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 23:02:59,919 INFO [train.py:1039] (1/4) Epoch 15, batch 3300, loss[loss=0.1954, simple_loss=0.2703, pruned_loss=0.06021, over 24571.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2604, pruned_loss=0.05687, over 4709305.06 frames. ], batch size: 60, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:03:02,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 23:03:03,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 23:03:03,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:03,940 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:03:08,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:03:09,619 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.939e+02 2.168e+02 2.538e+02 3.579e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 23:03:09,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:03:09,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:11,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:03:11,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:03:16,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:17,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=517866.6666666667, ans=0.07 2023-09-29 23:03:18,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:03:19,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=517866.6666666667, ans=0.0 2023-09-29 23:03:21,211 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.15 vs. limit=15.0 2023-09-29 23:03:22,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 23:03:22,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:23,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:23,831 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 23:03:25,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:03:26,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:03:28,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:03:28,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:03:28,274 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 23:03:32,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:32,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:03:34,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:34,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 23:03:36,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:03:36,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:38,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:03:41,230 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 23:03:42,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 23:03:43,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=517933.3333333333, ans=0.0 2023-09-29 23:03:44,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:03:47,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 23:03:48,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:03:50,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:03:50,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:03:52,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:53,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:53,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:53,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:03:55,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:03:55,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:56,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:03:59,850 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 23:04:01,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 23:04:03,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:04:04,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:04,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:07,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:04:07,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:11,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:04:11,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:11,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:04:12,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:04:13,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=518066.6666666667, ans=0.125 2023-09-29 23:04:14,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:04:15,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=22.5 2023-09-29 23:04:15,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 23:04:16,856 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.33 vs. limit=8.0 2023-09-29 23:04:17,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:18,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:20,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:04:20,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:04:21,761 INFO [train.py:1039] (1/4) Epoch 15, batch 3350, loss[loss=0.185, simple_loss=0.2733, pruned_loss=0.04829, over 24334.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2603, pruned_loss=0.05641, over 4721760.83 frames. ], batch size: 74, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:04:21,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:24,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:24,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:26,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:04:28,550 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.44 vs. limit=15.0 2023-09-29 23:04:29,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:30,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:04:32,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:35,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:04:37,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:39,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:04:40,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 23:04:42,732 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 23:04:42,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:47,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 23:04:47,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 23:04:48,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:04:48,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:04:50,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:50,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 23:04:50,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:51,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:04:52,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:54,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:55,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:56,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:04:59,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:04:59,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:01,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:05,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:05:07,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:10,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:10,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:12,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:15,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 23:05:15,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:05:15,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 23:05:15,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:05:17,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 23:05:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:21,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:27,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=518400.0, ans=0.0 2023-09-29 23:05:28,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:29,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 23:05:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:05:31,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:05:32,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:05:39,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:40,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 23:05:42,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:05:42,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:05:42,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:42,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=518466.6666666667, ans=0.125 2023-09-29 23:05:43,755 INFO [train.py:1039] (1/4) Epoch 15, batch 3400, loss[loss=0.1859, simple_loss=0.2733, pruned_loss=0.04929, over 24435.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2612, pruned_loss=0.05649, over 4719465.40 frames. ], batch size: 69, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:05:43,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 23:05:43,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:43,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 23:05:46,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:05:49,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:05:49,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 23:05:54,290 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.906e+02 2.208e+02 2.568e+02 3.814e+02, threshold=4.417e+02, percent-clipped=0.0 2023-09-29 23:05:54,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 23:05:54,423 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 23:05:54,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:56,411 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=518466.6666666667, ans=0.1 2023-09-29 23:05:59,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:59,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:06:00,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:02,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:06:06,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:09,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 23:06:10,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=518533.3333333333, ans=0.1 2023-09-29 23:06:14,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:06:16,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:16,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:16,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:06:21,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=518600.0, ans=0.125 2023-09-29 23:06:26,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:06:31,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 23:06:34,138 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.56 vs. limit=15.0 2023-09-29 23:06:37,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:38,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:39,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 23:06:39,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:06:40,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:41,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:42,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:06:42,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=518666.6666666667, ans=0.0 2023-09-29 23:06:46,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:48,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:06:48,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:06:54,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:06:56,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 23:07:02,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:07:05,195 INFO [train.py:1039] (1/4) Epoch 15, batch 3450, loss[loss=0.1736, simple_loss=0.2454, pruned_loss=0.05092, over 24336.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2605, pruned_loss=0.0561, over 4733283.60 frames. ], batch size: 56, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:07:05,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 23:07:07,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=518800.0, ans=0.0 2023-09-29 23:07:09,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 23:07:10,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:13,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:07:13,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 23:07:13,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:07:16,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:07:16,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=518800.0, ans=0.1 2023-09-29 23:07:16,778 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:07:21,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:07:22,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:23,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:07:23,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:26,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:32,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=15.0 2023-09-29 23:07:34,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 23:07:39,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 23:07:39,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:07:39,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:07:40,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:46,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 23:07:47,768 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.76 vs. limit=6.0 2023-09-29 23:07:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:07:50,372 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=518933.3333333333, ans=0.5 2023-09-29 23:07:53,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:07:53,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:53,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:07:55,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:07:57,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 23:07:57,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:07:58,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-09-29 23:07:59,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:08:02,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:05,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 23:08:10,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:08:16,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:08:17,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:19,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:20,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=519066.6666666667, ans=0.125 2023-09-29 23:08:21,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:21,795 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=519066.6666666667, ans=0.0 2023-09-29 23:08:22,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:08:23,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:08:23,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:08:25,820 INFO [train.py:1039] (1/4) Epoch 15, batch 3500, loss[loss=0.1839, simple_loss=0.2273, pruned_loss=0.0703, over 19222.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2593, pruned_loss=0.05661, over 4714430.36 frames. ], batch size: 388, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:08:28,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:32,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:08:32,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=519133.3333333333, ans=0.0 2023-09-29 23:08:32,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=519133.3333333333, ans=0.2 2023-09-29 23:08:33,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 23:08:35,735 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.112e+02 2.557e+02 4.010e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:08:36,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:08:38,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=519133.3333333333, ans=0.125 2023-09-29 23:08:39,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:08:43,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:43,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 23:08:47,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:08:49,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:50,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:08:50,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:08:50,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:08:52,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:52,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:08:52,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 23:08:54,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:55,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:08:55,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:08:59,387 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.99 vs. limit=15.0 2023-09-29 23:09:00,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:01,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 23:09:01,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:09:02,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=519266.6666666667, ans=0.0 2023-09-29 23:09:04,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:06,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:09:07,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:11,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:09:11,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:12,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 23:09:15,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 23:09:15,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 23:09:16,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:18,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:18,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:19,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:09:21,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:09:21,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:09:26,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:09:26,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=519333.3333333333, ans=0.0 2023-09-29 23:09:27,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 23:09:27,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 23:09:27,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:09:28,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=519333.3333333333, ans=0.125 2023-09-29 23:09:30,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:09:32,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:34,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:37,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 23:09:38,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:40,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:42,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 23:09:43,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 23:09:47,179 INFO [train.py:1039] (1/4) Epoch 15, batch 3550, loss[loss=0.1833, simple_loss=0.2223, pruned_loss=0.07215, over 18840.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2589, pruned_loss=0.05695, over 4710774.69 frames. ], batch size: 388, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:09:47,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:47,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:09:48,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:09:48,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:09:53,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.96 vs. limit=22.5 2023-09-29 23:09:54,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:10:03,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:05,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 23:10:08,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:09,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:10:11,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:12,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:10:12,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:10:15,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:15,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:10:18,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:18,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:10:18,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:10:25,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:10:25,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:26,340 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=519600.0, ans=0.125 2023-09-29 23:10:27,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:27,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:29,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:10:29,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 23:10:29,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:31,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:32,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:10:38,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:38,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:40,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:41,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 23:10:42,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:10:43,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 23:10:44,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:46,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:10:47,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:10:49,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 23:10:51,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 23:11:01,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:03,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:11:05,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 23:11:09,402 INFO [train.py:1039] (1/4) Epoch 15, batch 3600, loss[loss=0.1885, simple_loss=0.2628, pruned_loss=0.05712, over 23465.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2585, pruned_loss=0.05675, over 4703236.47 frames. ], batch size: 106, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:11:12,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 23:11:12,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:11:14,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:11:15,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:11:20,429 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.241e+02 2.559e+02 3.675e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-29 23:11:20,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:22,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:24,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:11:25,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:11:25,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:25,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 23:11:30,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:11:33,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:35,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:37,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=519866.6666666667, ans=0.0 2023-09-29 23:11:38,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:38,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:11:40,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:40,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 23:11:40,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:43,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:43,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:11:45,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:48,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:49,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:11:51,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 23:11:54,740 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:11:58,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:00,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:12:00,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 23:12:07,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:12:13,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:16,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:22,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:12:22,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:12:23,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 23:12:24,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 23:12:26,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 23:12:27,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:12:29,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:12:30,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 23:12:30,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:12:30,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:12:30,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:32,115 INFO [train.py:1039] (1/4) Epoch 15, batch 3650, loss[loss=0.1973, simple_loss=0.2814, pruned_loss=0.05662, over 24663.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2594, pruned_loss=0.0566, over 4708384.02 frames. ], batch size: 73, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:12:32,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 23:12:33,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 23:12:36,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=520133.3333333333, ans=0.1 2023-09-29 23:12:37,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:37,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=520133.3333333333, ans=0.1 2023-09-29 23:12:39,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 23:12:43,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 23:12:44,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:12:48,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 23:12:50,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 23:12:54,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:12:54,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:12:54,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:12:59,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:12:59,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:13:01,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 23:13:01,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:13:02,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:02,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 23:13:04,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:13:05,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:05,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:07,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:13:11,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 23:13:13,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 23:13:14,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:13:16,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 23:13:18,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:18,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:13:23,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:13:25,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:26,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:13:28,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:13:28,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:13:31,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:13:34,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:34,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=520333.3333333333, ans=0.125 2023-09-29 23:13:35,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:35,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:35,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=520333.3333333333, ans=0.125 2023-09-29 23:13:37,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:13:37,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:37,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=520400.0, ans=0.5 2023-09-29 23:13:39,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:47,173 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 23:13:50,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:50,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:53,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:13:53,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:13:54,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:13:55,832 INFO [train.py:1039] (1/4) Epoch 15, batch 3700, loss[loss=0.1812, simple_loss=0.2544, pruned_loss=0.05398, over 24633.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2594, pruned_loss=0.05606, over 4726818.28 frames. ], batch size: 60, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:13:57,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:57,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 23:13:58,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:02,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:14:04,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:14:04,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:14:07,295 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.795e+02 1.978e+02 2.320e+02 3.492e+02, threshold=3.956e+02, percent-clipped=0.0 2023-09-29 23:14:07,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:07,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 23:14:07,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:09,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:14:09,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:14:10,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:14:13,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:14:15,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:16,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:14:16,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:18,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:14:20,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:24,027 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 23:14:30,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:14:30,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:14:31,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:14:31,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=520600.0, ans=0.125 2023-09-29 23:14:32,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 23:14:32,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:37,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:39,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 23:14:39,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:40,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:14:43,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:43,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:14:46,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:14:53,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:53,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 23:14:53,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:53,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 23:14:58,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:14:58,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:15:02,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:04,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 23:15:05,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:15:05,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:15:05,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:05,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:09,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:11,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 23:15:12,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 23:15:13,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:15:13,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:15,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:15:17,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:15:18,588 INFO [train.py:1039] (1/4) Epoch 15, batch 3750, loss[loss=0.1972, simple_loss=0.2795, pruned_loss=0.05749, over 24020.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2608, pruned_loss=0.05669, over 4726839.01 frames. ], batch size: 80, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:15:20,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:15:21,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:15:23,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:15:25,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 23:15:25,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:15:28,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:15:28,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 23:15:28,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:15:30,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:32,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:33,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=520800.0, ans=0.125 2023-09-29 23:15:34,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:15:39,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:41,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=520866.6666666667, ans=0.125 2023-09-29 23:15:43,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:15:43,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:15:46,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:49,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:15:50,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 23:15:50,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:15:52,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:15:53,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:58,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 23:16:00,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 23:16:02,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:16:02,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:16:05,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:10,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:11,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:16:11,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=521000.0, ans=0.125 2023-09-29 23:16:17,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 23:16:17,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=521000.0, ans=0.125 2023-09-29 23:16:19,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:22,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:16:23,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:16:25,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:16:29,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:16:30,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:16:33,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:16:35,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:16:36,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:16:42,048 INFO [train.py:1039] (1/4) Epoch 15, batch 3800, loss[loss=0.1909, simple_loss=0.279, pruned_loss=0.0514, over 24441.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2614, pruned_loss=0.05672, over 4725663.38 frames. ], batch size: 69, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:16:48,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:16:52,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:53,955 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.926e+02 2.197e+02 2.572e+02 3.793e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 23:16:54,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:16:55,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 23:16:56,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.33 vs. limit=10.0 2023-09-29 23:16:57,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:58,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:16:58,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:16:59,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=521200.0, ans=0.5 2023-09-29 23:16:59,563 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-09-29 23:17:00,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=521200.0, ans=0.0 2023-09-29 23:17:01,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:17:01,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:03,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:17:04,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=521200.0, ans=0.125 2023-09-29 23:17:05,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:17:05,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:17:05,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:06,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 23:17:09,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 23:17:11,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:17:14,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:16,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:17:16,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:17:20,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:17:20,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:22,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:23,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:28,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:17:28,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 23:17:32,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:37,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:17:42,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:17:43,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 23:17:45,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 23:17:45,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:48,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:50,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:52,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 23:17:53,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=521400.0, ans=0.125 2023-09-29 23:17:56,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 23:17:56,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 23:17:56,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:58,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:18:01,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=521400.0, ans=0.2 2023-09-29 23:18:02,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:18:04,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:18:05,934 INFO [train.py:1039] (1/4) Epoch 15, batch 3850, loss[loss=0.1764, simple_loss=0.2327, pruned_loss=0.06007, over 22743.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2604, pruned_loss=0.05616, over 4731949.33 frames. ], batch size: 322, lr: 6.85e-03, grad_scale: 8.0 2023-09-29 23:18:08,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:18:09,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 23:18:11,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:18:11,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:14,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:18:17,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:17,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=521466.6666666667, ans=0.125 2023-09-29 23:18:20,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:18:22,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 23:18:29,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:32,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:34,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:34,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:18:37,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:38,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:18:40,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:42,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:18:42,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:43,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:45,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:45,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:18:46,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 23:18:46,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 23:18:46,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:47,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:50,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:51,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:51,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 23:18:53,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 23:18:53,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=521666.6666666667, ans=10.0 2023-09-29 23:18:56,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:57,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 23:18:58,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:19:04,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:06,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:19:10,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:10,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 23:19:11,730 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.09 vs. limit=10.0 2023-09-29 23:19:14,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 23:19:17,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:22,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:19:22,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:19:22,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:19:23,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 23:19:25,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:19:25,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 23:19:27,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:27,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:28,417 INFO [train.py:1039] (1/4) Epoch 15, batch 3900, loss[loss=0.1988, simple_loss=0.2754, pruned_loss=0.06112, over 24026.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2592, pruned_loss=0.05602, over 4726394.37 frames. ], batch size: 80, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:19:29,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=521800.0, ans=0.125 2023-09-29 23:19:30,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:19:30,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:32,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:19:32,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:32,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:33,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:33,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 23:19:33,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:38,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:40,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:40,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:19:41,907 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.819e+02 2.036e+02 2.423e+02 3.835e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-29 23:19:42,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:45,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:45,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:47,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:19:47,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=521866.6666666667, ans=0.125 2023-09-29 23:19:49,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 23:19:49,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:19:50,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 23:19:50,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:50,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 23:19:52,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 23:19:57,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:19:58,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:58,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:20:00,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:05,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:20:06,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:20:07,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=521933.3333333333, ans=0.0 2023-09-29 23:20:08,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=521933.3333333333, ans=0.0 2023-09-29 23:20:09,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:20:09,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:10,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:20:18,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:18,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:20:23,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=522000.0, ans=0.0 2023-09-29 23:20:25,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:20:28,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:20:40,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:20:43,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:43,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 23:20:44,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 23:20:44,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:46,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 23:20:47,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:49,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 23:20:51,209 INFO [train.py:1039] (1/4) Epoch 15, batch 3950, loss[loss=0.1528, simple_loss=0.2348, pruned_loss=0.03543, over 24575.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2585, pruned_loss=0.0562, over 4704533.55 frames. ], batch size: 60, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:20:56,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:58,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 23:20:58,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:21:01,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:21:03,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:21:08,964 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 23:21:09,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:09,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.02 vs. limit=22.5 2023-09-29 23:21:10,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 23:21:10,335 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 23:21:10,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:13,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:14,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:21:14,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:16,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 23:21:19,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:21:19,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:19,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:21:21,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:21:21,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:21:31,932 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=522266.6666666667, ans=0.0 2023-09-29 23:21:33,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:21:33,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:21:33,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=522266.6666666667, ans=0.5 2023-09-29 23:21:41,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 23:21:45,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=522333.3333333333, ans=0.0 2023-09-29 23:21:47,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 23:21:47,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 23:21:47,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:21:48,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=522333.3333333333, ans=0.5 2023-09-29 23:21:49,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:21:54,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:21:54,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:21:56,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:56,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:21:56,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 23:22:02,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=522400.0, ans=0.125 2023-09-29 23:22:03,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:22:04,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:22:07,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 23:22:15,407 INFO [train.py:1039] (1/4) Epoch 15, batch 4000, loss[loss=0.1753, simple_loss=0.2449, pruned_loss=0.05286, over 24491.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2588, pruned_loss=0.05651, over 4699712.50 frames. ], batch size: 58, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:22:17,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:24,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:28,380 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.844e+02 2.082e+02 2.375e+02 3.458e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 23:22:28,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:30,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:22:32,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:32,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 23:22:33,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:22:35,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 23:22:35,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:22:35,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 23:22:36,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:40,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:22:40,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:22:40,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:22:42,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:22:42,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:22:44,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:22:47,002 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 23:22:47,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=522600.0, ans=0.125 2023-09-29 23:22:48,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:22:48,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:22:50,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=522600.0, ans=0.0 2023-09-29 23:22:51,702 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 23:22:53,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:22:53,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:22:59,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 23:23:01,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:23:03,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:23:03,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=522666.6666666667, ans=0.1 2023-09-29 23:23:04,853 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 23:23:05,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:23:06,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 23:23:06,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:23:06,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:08,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:23:10,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=522666.6666666667, ans=0.0 2023-09-29 23:23:10,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=522666.6666666667, ans=0.2 2023-09-29 23:23:11,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:23:11,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:23:11,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:23:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 23:23:13,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:15,055 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 23:23:21,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:23:24,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:23:25,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:23:27,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:28,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:23:29,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:23:31,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=522733.3333333333, ans=0.2 2023-09-29 23:23:34,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:37,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:23:37,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 23:23:38,528 INFO [train.py:1039] (1/4) Epoch 15, batch 4050, loss[loss=0.2075, simple_loss=0.2678, pruned_loss=0.07365, over 22895.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2599, pruned_loss=0.05699, over 4708007.09 frames. ], batch size: 322, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:23:38,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:23:38,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:23:40,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:23:40,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=522800.0, ans=0.1 2023-09-29 23:23:42,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:23:42,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:42,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=522800.0, ans=0.0 2023-09-29 23:23:45,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=522800.0, ans=0.125 2023-09-29 23:23:47,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:50,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:23:50,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:23:54,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:23:55,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:24:00,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:02,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:24:06,397 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.83 vs. limit=12.0 2023-09-29 23:24:07,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 23:24:08,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 23:24:10,080 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 23:24:11,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:24:19,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 23:24:19,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:23,141 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=522933.3333333333, ans=0.125 2023-09-29 23:24:24,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:27,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:27,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:24:27,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:32,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:24:36,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 23:24:36,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:24:38,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:40,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 23:24:44,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:50,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 23:24:53,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:24:55,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 23:24:55,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 23:24:55,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:24:57,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:24:59,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:24:59,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:24:59,905 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:25:00,463 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.15 vs. limit=22.5 2023-09-29 23:25:00,925 INFO [train.py:1039] (1/4) Epoch 15, batch 4100, loss[loss=0.1891, simple_loss=0.2737, pruned_loss=0.05221, over 24350.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2607, pruned_loss=0.05731, over 4704850.96 frames. ], batch size: 77, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:25:06,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 23:25:07,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=523133.3333333333, ans=0.2 2023-09-29 23:25:08,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 23:25:10,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 23:25:12,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 23:25:12,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:13,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:25:15,134 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 23:25:16,440 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.960e+02 2.243e+02 2.866e+02 4.978e+02, threshold=4.486e+02, percent-clipped=4.0 2023-09-29 23:25:18,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:18,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:25:18,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:18,981 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.75 vs. limit=15.0 2023-09-29 23:25:19,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:25:22,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:25:24,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:24,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:25:24,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 23:25:24,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:25,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:25:25,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:25,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:25:26,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 23:25:29,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:31,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 23:25:33,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:25:35,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=523266.6666666667, ans=0.125 2023-09-29 23:25:36,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:36,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 23:25:38,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:25:40,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:25:40,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:25:40,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=523266.6666666667, ans=0.1 2023-09-29 23:25:41,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 23:25:43,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:25:43,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:25:47,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 23:25:47,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=523266.6666666667, ans=0.125 2023-09-29 23:25:48,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:48,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:25:51,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:53,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=523333.3333333333, ans=0.125 2023-09-29 23:25:58,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:01,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:02,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:26:07,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=523400.0, ans=0.125 2023-09-29 23:26:13,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:13,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:26:18,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:19,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:26:23,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:26:25,099 INFO [train.py:1039] (1/4) Epoch 15, batch 4150, loss[loss=0.1907, simple_loss=0.2659, pruned_loss=0.05769, over 23350.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2615, pruned_loss=0.05712, over 4719066.95 frames. ], batch size: 93, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:26:26,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:26:26,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:26:26,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:29,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 23:26:29,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:31,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 23:26:31,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 23:26:31,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=523466.6666666667, ans=0.125 2023-09-29 23:26:32,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 23:26:34,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:34,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=523466.6666666667, ans=0.05 2023-09-29 23:26:40,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:26:40,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:45,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:26:47,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:26:47,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:26:47,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=523533.3333333333, ans=0.125 2023-09-29 23:26:50,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:26:51,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:53,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:26:53,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=523533.3333333333, ans=0.125 2023-09-29 23:26:57,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:02,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:03,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 23:27:05,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 23:27:05,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:27:07,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 23:27:07,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:27:07,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:08,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:10,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:13,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 23:27:15,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:19,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:27:19,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 23:27:19,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:21,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 23:27:23,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:27:26,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:27,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:29,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 23:27:29,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:27:29,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:27:30,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:27:31,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=523733.3333333333, ans=0.125 2023-09-29 23:27:32,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 23:27:32,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=523733.3333333333, ans=0.0 2023-09-29 23:27:34,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:34,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:27:34,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:27:34,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 23:27:35,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:35,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:27:36,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:39,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:39,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 23:27:40,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:46,930 INFO [train.py:1039] (1/4) Epoch 15, batch 4200, loss[loss=0.1874, simple_loss=0.2688, pruned_loss=0.05297, over 24004.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2604, pruned_loss=0.0564, over 4737855.68 frames. ], batch size: 86, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:27:47,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:27:49,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 23:27:49,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=523800.0, ans=0.04949747468305833 2023-09-29 23:27:52,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:27:52,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=523800.0, ans=0.125 2023-09-29 23:27:55,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:27:55,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:27:56,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:56,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:58,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 23:28:00,462 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.93 vs. limit=6.0 2023-09-29 23:28:01,700 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.915e+02 2.061e+02 2.276e+02 4.406e+02, threshold=4.122e+02, percent-clipped=0.0 2023-09-29 23:28:02,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 23:28:02,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:05,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:08,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:28:10,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:28:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:28:11,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:13,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 23:28:13,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:14,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:14,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:28:15,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:28:16,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:28:18,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-09-29 23:28:21,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 23:28:21,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:26,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:28:28,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:28:29,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:28:31,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:28:33,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:28:33,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 23:28:33,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:35,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:28:35,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=524000.0, ans=0.035 2023-09-29 23:28:40,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:28:43,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:28:49,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:28:52,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 23:28:54,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:56,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=524066.6666666667, ans=0.125 2023-09-29 23:28:56,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=524066.6666666667, ans=0.0 2023-09-29 23:28:59,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:29:01,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:02,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 23:29:07,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:29:09,500 INFO [train.py:1039] (1/4) Epoch 15, batch 4250, loss[loss=0.2031, simple_loss=0.287, pruned_loss=0.05958, over 24659.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2591, pruned_loss=0.05575, over 4744498.11 frames. ], batch size: 68, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:29:12,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:29:12,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:29:15,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:20,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:29:20,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 23:29:21,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=524133.3333333333, ans=0.125 2023-09-29 23:29:22,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:29:24,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:24,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=524200.0, ans=0.0 2023-09-29 23:29:27,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:34,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:34,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:34,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:29:34,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:29:36,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:37,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:39,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:42,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:29:44,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:29:45,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 23:29:48,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 23:29:48,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:49,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:49,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:51,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:29:51,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:52,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:52,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=524266.6666666667, ans=0.0 2023-09-29 23:29:55,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:29:57,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:30:02,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:04,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:06,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 23:30:06,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:30:06,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 23:30:07,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:30:09,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:30:10,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:10,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:30:12,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 23:30:14,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:30:15,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:30:20,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:23,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:25,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:30:27,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:28,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:30,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:30:30,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:30:30,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 23:30:31,791 INFO [train.py:1039] (1/4) Epoch 15, batch 4300, loss[loss=0.189, simple_loss=0.2534, pruned_loss=0.06229, over 23493.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2586, pruned_loss=0.05565, over 4738658.27 frames. ], batch size: 119, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:30:32,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:36,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:38,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:30:41,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:47,073 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.099e+02 2.369e+02 3.970e+02, threshold=4.198e+02, percent-clipped=0.0 2023-09-29 23:30:47,941 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-09-29 23:30:50,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:50,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 23:30:51,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:30:53,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:30:54,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.75 vs. limit=15.0 2023-09-29 23:30:55,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:30:55,313 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 23:30:58,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:31:00,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:05,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 23:31:05,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:31:05,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 23:31:08,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:31:10,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:31:10,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=524600.0, ans=0.1 2023-09-29 23:31:14,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:31:14,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:31:14,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:31:14,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=524600.0, ans=0.2 2023-09-29 23:31:15,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:16,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:31:16,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 23:31:19,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 23:31:20,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:31:22,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:22,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:31:22,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:23,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:23,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 23:31:23,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 23:31:23,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 23:31:25,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=524666.6666666666, ans=0.125 2023-09-29 23:31:25,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=524666.6666666666, ans=0.2 2023-09-29 23:31:26,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:31:26,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 23:31:27,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 23:31:29,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=524666.6666666666, ans=0.125 2023-09-29 23:31:32,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:33,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 23:31:35,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:31:36,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=524666.6666666666, ans=0.2 2023-09-29 23:31:37,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:37,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:39,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 23:31:39,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:39,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:41,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:31:42,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:42,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:31:45,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:31:47,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:48,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:49,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:49,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=524733.3333333334, ans=0.5 2023-09-29 23:31:55,406 INFO [train.py:1039] (1/4) Epoch 15, batch 4350, loss[loss=0.2067, simple_loss=0.2663, pruned_loss=0.07349, over 23712.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2596, pruned_loss=0.056, over 4726558.28 frames. ], batch size: 232, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:31:55,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 23:31:55,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:31:58,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=524800.0, ans=0.125 2023-09-29 23:32:03,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:06,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:09,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:32:09,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:32:13,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:32:18,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:21,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:32:21,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:32:24,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=524866.6666666666, ans=0.0 2023-09-29 23:32:25,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:32:27,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:32:28,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:32:35,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 23:32:36,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:37,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:40,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:44,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 23:32:49,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:32:52,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:32:57,594 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 23:32:59,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:32:59,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:32:59,863 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 23:33:00,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.85 vs. limit=15.0 2023-09-29 23:33:01,343 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 23:33:01,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:01,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:02,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:33:02,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:04,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:05,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:07,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 23:33:08,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:08,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:08,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:10,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 23:33:12,004 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 23:33:12,012 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 23:33:12,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 23:33:15,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:33:16,458 INFO [train.py:1039] (1/4) Epoch 15, batch 4400, loss[loss=0.1959, simple_loss=0.2692, pruned_loss=0.06128, over 23611.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2606, pruned_loss=0.05618, over 4730074.92 frames. ], batch size: 135, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:33:16,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:33:16,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:17,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:33:19,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 23:33:21,224 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 23:33:21,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:26,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:27,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:29,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:30,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 23:33:30,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 23:33:32,656 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.997e+02 2.183e+02 2.511e+02 3.955e+02, threshold=4.366e+02, percent-clipped=0.0 2023-09-29 23:33:32,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 23:33:32,843 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 23:33:34,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:33:34,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:35,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 23:33:37,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=525200.0, ans=0.0 2023-09-29 23:33:39,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:40,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:40,548 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 23:33:43,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:43,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 23:33:43,795 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 23:33:46,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 23:33:46,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 23:33:47,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 23:33:48,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:48,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:53,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 23:33:53,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 23:33:54,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:56,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:33:56,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:58,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:59,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:59,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 23:34:00,519 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 23:34:02,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:11,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:34:12,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 23:34:16,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:34:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:20,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:34:20,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 23:34:20,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:34:20,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:34:20,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:34:22,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:34:27,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 23:34:30,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 23:34:31,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=525400.0, ans=0.0 2023-09-29 23:34:32,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 23:34:32,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:34:32,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 23:34:32,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:34:36,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:34:40,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 23:34:41,758 INFO [train.py:1039] (1/4) Epoch 15, batch 4450, loss[loss=0.1572, simple_loss=0.2255, pruned_loss=0.0445, over 24470.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2622, pruned_loss=0.05741, over 4719339.46 frames. ], batch size: 58, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:34:43,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:45,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:46,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:34:52,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:34:52,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:34:53,009 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=525466.6666666666, ans=0.125 2023-09-29 23:34:55,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:58,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:35:01,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:35:01,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:02,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 23:35:02,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:02,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:04,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:04,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:35:08,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:35:13,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:14,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:16,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:18,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:18,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:35:22,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:35:25,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 23:35:25,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 23:35:25,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:35:28,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:30,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 23:35:34,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:35:38,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:39,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 23:35:39,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:39,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:39,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:35:39,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:42,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:45,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:35:45,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 23:35:47,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:35:49,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:51,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:53,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:53,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:35:58,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:36:01,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 23:36:02,722 INFO [train.py:1039] (1/4) Epoch 15, batch 4500, loss[loss=0.1739, simple_loss=0.258, pruned_loss=0.0449, over 24488.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2626, pruned_loss=0.05706, over 4729788.45 frames. ], batch size: 63, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:36:02,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:36:07,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:08,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 23:36:08,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 23:36:11,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:15,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:36:15,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:17,836 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.874e+02 2.112e+02 2.381e+02 3.744e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:36:17,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:36:18,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:36:19,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:19,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:30,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:31,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=525866.6666666666, ans=0.2 2023-09-29 23:36:32,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:36:35,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:36:35,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:36:37,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:36:43,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:36:47,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:36:51,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:36:56,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:36:56,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 23:36:57,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:36:57,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:36:59,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:37:00,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:37:02,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:37:02,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 23:37:02,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:37:02,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:05,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:37:05,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:37:09,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:12,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:37:12,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:37:13,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 23:37:16,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 23:37:16,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 23:37:18,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 23:37:23,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 23:37:23,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=526133.3333333334, ans=0.0 2023-09-29 23:37:24,408 INFO [train.py:1039] (1/4) Epoch 15, batch 4550, loss[loss=0.179, simple_loss=0.2651, pruned_loss=0.04647, over 24661.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2612, pruned_loss=0.05664, over 4738749.36 frames. ], batch size: 73, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:37:24,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:29,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:29,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:31,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:37,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:37:39,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:37:40,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:37:40,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:37:40,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:43,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:43,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:46,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:37:50,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 23:37:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 23:37:50,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=526200.0, ans=0.025 2023-09-29 23:37:51,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:37:53,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 23:37:58,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 23:37:59,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:04,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 23:38:07,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:38:08,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:08,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:10,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:38:11,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 23:38:15,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:16,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:18,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:18,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:21,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 23:38:21,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 23:38:21,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:38:22,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 23:38:25,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 23:38:25,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:26,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=526333.3333333334, ans=0.1 2023-09-29 23:38:27,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:27,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:29,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:29,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:38:32,081 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.31 vs. limit=15.0 2023-09-29 23:38:33,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:38:33,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 23:38:34,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:34,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:38:36,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 23:38:36,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:38:36,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 23:38:39,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:38:39,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:38:42,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:38:42,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:42,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:38:45,634 INFO [train.py:1039] (1/4) Epoch 15, batch 4600, loss[loss=0.1972, simple_loss=0.275, pruned_loss=0.05968, over 24063.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2599, pruned_loss=0.05599, over 4732718.47 frames. ], batch size: 80, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:38:45,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:38:47,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:38:50,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:51,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:55,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:38:55,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:38:55,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:38:56,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 23:38:56,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=526466.6666666666, ans=0.125 2023-09-29 23:38:58,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:38:58,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=526466.6666666666, ans=0.125 2023-09-29 23:38:59,842 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.932e+02 2.167e+02 2.436e+02 3.970e+02, threshold=4.334e+02, percent-clipped=0.0 2023-09-29 23:39:02,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:39:02,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=526533.3333333334, ans=0.125 2023-09-29 23:39:03,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:06,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:12,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 23:39:14,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:16,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:20,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:39:20,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:20,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=526600.0, ans=0.125 2023-09-29 23:39:22,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=526600.0, ans=0.125 2023-09-29 23:39:23,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=15.0 2023-09-29 23:39:23,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 23:39:23,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:39:25,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:39:25,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=526600.0, ans=0.1 2023-09-29 23:39:31,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:33,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:39:33,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:39:38,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 23:39:39,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:39:45,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:45,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:39:48,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:48,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 23:39:49,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:49,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 23:39:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:51,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:51,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:52,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:52,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:54,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 23:39:54,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 23:39:54,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 23:39:54,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:56,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:39:57,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:57,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:40:09,579 INFO [train.py:1039] (1/4) Epoch 15, batch 4650, loss[loss=0.1913, simple_loss=0.268, pruned_loss=0.05732, over 23988.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2603, pruned_loss=0.05603, over 4735262.98 frames. ], batch size: 86, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:40:11,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:40:13,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:15,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:15,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:40:15,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:40:15,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:16,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:20,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 23:40:23,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:40:24,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 23:40:24,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:26,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 23:40:26,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:40:27,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 23:40:27,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 23:40:27,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:27,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=526866.6666666666, ans=0.0 2023-09-29 23:40:29,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:40:29,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=526866.6666666666, ans=0.125 2023-09-29 23:40:31,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=526866.6666666666, ans=0.1 2023-09-29 23:40:32,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:40:33,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:33,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 23:40:36,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:36,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 23:40:40,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:40,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:40:42,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 23:40:45,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:40:47,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:40:52,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:57,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:00,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:41:03,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 23:41:04,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 23:41:06,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 23:41:06,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 23:41:07,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:11,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=527000.0, ans=0.125 2023-09-29 23:41:15,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:41:15,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:15,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 23:41:16,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:17,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:17,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:41:21,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:41:23,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:41:23,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:25,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:28,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:28,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:41:30,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:41:30,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 23:41:30,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:41:32,038 INFO [train.py:1039] (1/4) Epoch 15, batch 4700, loss[loss=0.2516, simple_loss=0.3007, pruned_loss=0.1012, over 19553.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2605, pruned_loss=0.05641, over 4726196.81 frames. ], batch size: 389, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:41:33,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 23:41:40,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=527133.3333333334, ans=0.125 2023-09-29 23:41:41,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:42,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:43,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:41:44,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:44,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:41:46,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=527200.0, ans=0.015 2023-09-29 23:41:48,315 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.872e+02 2.063e+02 2.349e+02 3.516e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-29 23:41:50,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 23:41:50,255 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=527200.0, ans=0.0 2023-09-29 23:41:51,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 23:41:51,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=527200.0, ans=0.125 2023-09-29 23:41:53,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:55,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:41:55,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:59,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:06,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:42:07,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:42:09,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:42:14,059 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:42:15,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 23:42:15,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:42:18,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:20,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=527333.3333333334, ans=0.1 2023-09-29 23:42:22,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 23:42:23,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:42:29,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:42:30,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 23:42:32,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:32,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:34,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:35,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.82 vs. limit=12.0 2023-09-29 23:42:36,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:42:36,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 23:42:37,671 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 23:42:39,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:40,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 23:42:41,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:44,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 23:42:47,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:42:48,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,238 INFO [train.py:1039] (1/4) Epoch 15, batch 4750, loss[loss=0.1875, simple_loss=0.2723, pruned_loss=0.05134, over 24618.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2607, pruned_loss=0.05685, over 4726089.99 frames. ], batch size: 68, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:42:53,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:42:57,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 23:42:58,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:42:58,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=527466.6666666666, ans=0.1 2023-09-29 23:43:03,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 23:43:06,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:43:06,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:43:08,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:14,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 23:43:18,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:43:21,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 23:43:22,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:24,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:24,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:25,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:25,886 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 23:43:25,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 23:43:33,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 23:43:36,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:38,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:43:42,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:43:42,124 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 23:43:42,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:43:45,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:43:48,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:43:50,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 23:43:50,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 23:43:50,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:52,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:43:52,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:53,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:43:53,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 23:43:56,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 23:43:59,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:02,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:44:02,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 23:44:02,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:05,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:44:07,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:07,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:44:12,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:12,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 23:44:14,006 INFO [train.py:1039] (1/4) Epoch 15, batch 4800, loss[loss=0.216, simple_loss=0.2773, pruned_loss=0.07734, over 23619.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2621, pruned_loss=0.05734, over 4730715.59 frames. ], batch size: 256, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:44:14,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 23:44:15,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 23:44:18,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:44:19,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:19,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 23:44:27,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:27,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:30,099 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.901e+02 2.248e+02 2.763e+02 5.522e+02, threshold=4.496e+02, percent-clipped=3.0 2023-09-29 23:44:31,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:44:33,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:33,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:34,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 23:44:34,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:44:37,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:44:41,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:44:41,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.98 vs. limit=15.0 2023-09-29 23:44:42,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:44:44,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:44:44,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:50,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:53,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:44:56,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:44:58,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.89 vs. limit=5.0 2023-09-29 23:44:58,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:01,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 23:45:01,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 23:45:03,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:03,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:45:03,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:45:03,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:03,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:45:05,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=528000.0, ans=0.1 2023-09-29 23:45:06,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:45:06,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:09,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:11,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:12,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:17,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 23:45:17,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:17,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:18,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:45:18,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:24,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:24,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:45:24,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:26,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:45:26,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:45:28,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:45:28,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=528066.6666666666, ans=0.125 2023-09-29 23:45:32,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:32,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:32,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:33,524 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.30 vs. limit=10.0 2023-09-29 23:45:36,016 INFO [train.py:1039] (1/4) Epoch 15, batch 4850, loss[loss=0.173, simple_loss=0.2406, pruned_loss=0.05267, over 13127.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2618, pruned_loss=0.05718, over 4725680.51 frames. ], batch size: 27, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:45:36,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 23:45:37,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 23:45:37,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:37,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:40,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:45:40,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:42,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:48,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 23:45:52,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:57,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:45:58,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:45:58,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:02,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:46:02,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:46:04,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:46:04,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 23:46:05,744 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.58 vs. limit=22.5 2023-09-29 23:46:09,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:46:11,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:46:11,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:46:12,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:46:12,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 23:46:15,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:46:15,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:18,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:19,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 23:46:19,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 23:46:20,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:46:29,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:46:29,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 23:46:31,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:46:31,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:46:32,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:46:35,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 23:46:35,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:35,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 23:46:35,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:37,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:46:37,817 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.68 vs. limit=15.0 2023-09-29 23:46:38,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 23:46:41,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=528400.0, ans=0.125 2023-09-29 23:46:48,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:49,194 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.54 vs. limit=15.0 2023-09-29 23:46:54,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:46:54,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:46:57,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.59 vs. limit=15.0 2023-09-29 23:46:58,248 INFO [train.py:1039] (1/4) Epoch 15, batch 4900, loss[loss=0.1744, simple_loss=0.236, pruned_loss=0.05639, over 23804.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2603, pruned_loss=0.05665, over 4720156.67 frames. ], batch size: 212, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:46:58,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 23:46:58,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:47:05,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:07,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:07,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:47:11,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 23:47:14,801 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.963e+02 2.199e+02 2.506e+02 3.437e+02, threshold=4.398e+02, percent-clipped=0.0 2023-09-29 23:47:16,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 23:47:16,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=528533.3333333334, ans=0.1 2023-09-29 23:47:21,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 23:47:23,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 23:47:23,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:23,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:23,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:47:23,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:23,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:47:24,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 23:47:28,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 23:47:28,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:47:30,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:47:30,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:30,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=528600.0, ans=0.125 2023-09-29 23:47:33,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:47:33,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:33,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:33,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 23:47:37,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:47:39,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:39,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 23:47:39,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 23:47:43,451 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.64 vs. limit=15.0 2023-09-29 23:47:44,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 23:47:47,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:47:47,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:47:47,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:47:47,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:47,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:47:49,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:47:49,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 23:47:51,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:54,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:47:55,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:48:00,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 23:48:02,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:48:02,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 23:48:03,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 23:48:04,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=528733.3333333334, ans=0.125 2023-09-29 23:48:10,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:12,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:14,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 23:48:14,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:14,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:48:17,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:20,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:48:20,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:48:20,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:20,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:48:21,878 INFO [train.py:1039] (1/4) Epoch 15, batch 4950, loss[loss=0.1865, simple_loss=0.2636, pruned_loss=0.05472, over 23097.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2593, pruned_loss=0.05637, over 4714254.82 frames. ], batch size: 105, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:48:22,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:48:22,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=528800.0, ans=0.125 2023-09-29 23:48:25,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:25,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:25,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=528800.0, ans=0.2 2023-09-29 23:48:28,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 23:48:28,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 23:48:30,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:48:30,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 23:48:31,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:48:31,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:48:31,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:35,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:35,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:48:37,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:48:38,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:41,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:41,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:48:46,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:48:52,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:53,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:55,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:55,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:56,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:48:57,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 23:48:58,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 23:49:00,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=528933.3333333334, ans=0.125 2023-09-29 23:49:01,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:03,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:49:03,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:49:05,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:05,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:49:07,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:49:08,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:09,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=528933.3333333334, ans=0.0 2023-09-29 23:49:11,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:49:14,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:49:14,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:14,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=529000.0, ans=0.125 2023-09-29 23:49:16,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:16,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 23:49:18,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:49:18,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=529000.0, ans=0.125 2023-09-29 23:49:19,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:49:25,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:49:25,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=529000.0, ans=0.125 2023-09-29 23:49:26,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:49:26,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:49:26,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:28,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:49:28,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:49:31,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:49:31,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:49:31,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:32,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 23:49:34,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:49:41,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 23:49:41,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:49:44,493 INFO [train.py:1039] (1/4) Epoch 15, batch 5000, loss[loss=0.1877, simple_loss=0.2661, pruned_loss=0.05469, over 24684.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2584, pruned_loss=0.05601, over 4718975.30 frames. ], batch size: 65, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:49:48,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:48,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:49:49,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 23:49:51,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 23:49:51,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:49:52,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=529133.3333333334, ans=0.125 2023-09-29 23:49:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 23:49:55,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:55,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:49:57,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 23:49:58,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:58,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:49:59,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 23:49:59,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:01,014 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.874e+02 2.133e+02 2.483e+02 3.662e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-29 23:50:01,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:02,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 23:50:02,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 23:50:04,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:50:04,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 23:50:04,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:50:04,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:04,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=529200.0, ans=0.125 2023-09-29 23:50:05,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:50:05,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 23:50:05,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 23:50:07,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 23:50:07,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:07,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:09,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 23:50:09,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:50:13,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:13,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=529200.0, ans=0.1 2023-09-29 23:50:14,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:16,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 23:50:17,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 23:50:17,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:50:18,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=529266.6666666666, ans=0.0 2023-09-29 23:50:21,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:50:26,069 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 23:50:26,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=529266.6666666666, ans=0.1 2023-09-29 23:50:29,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:50:30,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=529266.6666666666, ans=0.0 2023-09-29 23:50:31,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:31,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:50:33,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 23:50:33,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:33,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:35,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:50:36,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:50:38,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:42,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:42,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:50:47,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 23:50:54,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:02,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:03,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:03,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:51:03,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:03,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:51:03,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:51:05,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:07,361 INFO [train.py:1039] (1/4) Epoch 15, batch 5050, loss[loss=0.2031, simple_loss=0.2707, pruned_loss=0.06776, over 23354.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.259, pruned_loss=0.05571, over 4728989.83 frames. ], batch size: 119, lr: 6.80e-03, grad_scale: 8.0 2023-09-29 23:51:11,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:11,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 23:51:12,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:51:15,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=529466.6666666666, ans=0.125 2023-09-29 23:51:16,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:17,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:51:17,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 23:51:19,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:19,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:51:21,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=529533.3333333334, ans=0.0 2023-09-29 23:51:22,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:51:24,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:51:24,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:51:34,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 23:51:34,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:51:34,919 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:51:36,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:36,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 23:51:36,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:51:37,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:39,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:51:39,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 23:51:40,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 23:51:41,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:44,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:51:47,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:47,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 23:51:50,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:51:52,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 23:51:56,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:51:56,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:51:56,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:56,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:59,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:01,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:52:02,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:02,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:52:02,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:52:02,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 23:52:04,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:52:06,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:52:08,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=529666.6666666666, ans=0.2 2023-09-29 23:52:10,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:52:10,799 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 23:52:10,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:52:10,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:12,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:12,593 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 23:52:14,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:14,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 23:52:14,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:19,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 23:52:21,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 23:52:24,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:24,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:24,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:52:27,861 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 23:52:29,671 INFO [train.py:1039] (1/4) Epoch 15, batch 5100, loss[loss=0.1588, simple_loss=0.2319, pruned_loss=0.04285, over 24606.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2595, pruned_loss=0.05572, over 4739433.81 frames. ], batch size: 60, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:52:32,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:35,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 23:52:35,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 23:52:38,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:39,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:41,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:42,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 23:52:42,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 23:52:47,408 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.887e+02 2.076e+02 2.309e+02 3.546e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-29 23:52:47,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:47,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:52:52,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:54,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 23:52:55,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:57,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:57,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:53:00,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:00,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:02,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 23:53:06,415 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 23:53:07,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:07,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 23:53:08,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 23:53:13,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:53:19,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=530000.0, ans=0.2 2023-09-29 23:53:23,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:53:26,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 23:53:27,511 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 23:53:27,524 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 23:53:28,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=530000.0, ans=0.1 2023-09-29 23:53:29,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 23:53:29,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:32,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 23:53:35,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 23:53:35,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=530066.6666666666, ans=0.125 2023-09-29 23:53:37,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:53:39,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:53:40,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 23:53:42,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:53:42,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 23:53:49,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:53:49,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:53:49,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:53:50,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:53:50,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:53:52,311 INFO [train.py:1039] (1/4) Epoch 15, batch 5150, loss[loss=0.1885, simple_loss=0.2754, pruned_loss=0.05078, over 24652.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2611, pruned_loss=0.05696, over 4724563.03 frames. ], batch size: 68, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:53:52,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:53:53,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 23:53:53,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 23:53:55,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 23:53:55,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:53:55,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 23:53:57,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=530133.3333333334, ans=0.0 2023-09-29 23:53:58,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:00,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 23:54:01,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:02,369 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:54:03,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:06,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:54:06,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 23:54:08,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:09,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:54:11,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:54:11,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:11,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:12,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:54:12,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:54:12,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=530200.0, ans=0.2 2023-09-29 23:54:12,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=530200.0, ans=0.0 2023-09-29 23:54:13,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 23:54:14,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:54:16,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:18,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:54:19,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 23:54:19,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:54:23,874 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=530266.6666666666, ans=0.125 2023-09-29 23:54:27,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:54:27,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 23:54:31,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:37,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:37,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:40,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=530333.3333333334, ans=0.125 2023-09-29 23:54:43,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:43,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:45,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 23:54:50,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:51,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:54:51,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:55,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:56,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:58,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 23:55:05,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:07,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:55:10,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:55:10,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:55:11,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:55:11,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:55:11,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:55:11,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=530400.0, ans=0.05 2023-09-29 23:55:13,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:55:14,528 INFO [train.py:1039] (1/4) Epoch 15, batch 5200, loss[loss=0.1614, simple_loss=0.2255, pruned_loss=0.04868, over 23657.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2615, pruned_loss=0.05686, over 4728822.05 frames. ], batch size: 232, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:55:16,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:55:17,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:55:19,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:21,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=530466.6666666666, ans=0.0 2023-09-29 23:55:24,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 23:55:26,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:55:27,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:30,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-09-29 23:55:31,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:31,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:55:32,508 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.846e+02 2.066e+02 2.366e+02 4.637e+02, threshold=4.132e+02, percent-clipped=1.0 2023-09-29 23:55:32,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:34,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 23:55:36,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:55:36,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:38,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 23:55:42,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:55:43,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=530533.3333333334, ans=0.04949747468305833 2023-09-29 23:55:44,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:55:44,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 23:55:44,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 23:55:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 23:55:49,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:49,494 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 23:55:49,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:51,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:51,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:55:52,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 23:55:53,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:55:55,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:59,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 23:55:59,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 23:55:59,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 23:56:07,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 23:56:09,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:56:11,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=530666.6666666666, ans=0.1 2023-09-29 23:56:11,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=530666.6666666666, ans=0.125 2023-09-29 23:56:14,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:56:14,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:14,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=530666.6666666666, ans=0.125 2023-09-29 23:56:15,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 23:56:15,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:56:16,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:56:16,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:17,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:56:21,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:21,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=530733.3333333334, ans=0.2 2023-09-29 23:56:22,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:56:22,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=530733.3333333334, ans=0.125 2023-09-29 23:56:25,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:56:27,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:27,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:29,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=530733.3333333334, ans=0.1 2023-09-29 23:56:30,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:32,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 23:56:33,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:33,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:56:35,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:36,886 INFO [train.py:1039] (1/4) Epoch 15, batch 5250, loss[loss=0.1652, simple_loss=0.2435, pruned_loss=0.04348, over 24608.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2604, pruned_loss=0.05669, over 4729827.29 frames. ], batch size: 60, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:56:36,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:56:37,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:56:38,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=530800.0, ans=0.0 2023-09-29 23:56:42,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:56:44,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:45,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:56:47,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:56:52,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:55,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:56:57,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:56:58,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:57:01,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 23:57:01,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:57:03,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:57:26,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=531000.0, ans=0.1 2023-09-29 23:57:52,153 INFO [train.py:1039] (1/4) Epoch 15, batch 5300, loss[loss=0.1903, simple_loss=0.2652, pruned_loss=0.05773, over 23657.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2587, pruned_loss=0.05641, over 4710524.48 frames. ], batch size: 85, lr: 6.78e-03, grad_scale: 16.0 2023-09-29 23:58:06,814 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.161e+02 2.637e+02 4.366e+02, threshold=4.323e+02, percent-clipped=1.0 2023-09-29 23:58:06,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:58:07,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 23:58:07,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 23:58:07,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:07,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:07,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:07,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:07,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:07,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:07,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:07,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:58:08,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:58:08,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 23:58:08,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 23:58:08,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 23:58:08,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:58:08,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 23:58:09,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 23:58:09,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:09,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:09,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:09,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:10,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:58:10,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:11,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:11,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:11,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:11,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:11,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:58:11,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:11,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:58:12,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 23:58:12,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:12,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:12,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 23:58:12,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 23:58:13,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:58:13,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:13,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 23:58:13,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 23:58:13,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:14,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:58:14,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:15,046 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 23:58:15,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 23:58:15,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:58:15,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:15,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 23:58:15,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 23:58:15,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 23:58:15,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:24,974 INFO [train.py:1039] (1/4) Epoch 16, batch 0, loss[loss=0.2701, simple_loss=0.3215, pruned_loss=0.1093, over 19349.00 frames. ], tot_loss[loss=0.2701, simple_loss=0.3215, pruned_loss=0.1093, over 19349.00 frames. ], batch size: 388, lr: 6.57e-03, grad_scale: 32.0 2023-09-29 23:58:24,975 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-29 23:58:41,242 INFO [train.py:1071] (1/4) Epoch 16, validation: loss=0.3148, simple_loss=0.2815, pruned_loss=0.174, over 1125622.00 frames. 2023-09-29 23:58:41,243 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-29 23:58:41,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 23:58:41,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=531213.3333333334, ans=0.1 2023-09-29 23:58:42,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:58:44,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:58:50,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=531213.3333333334, ans=0.125 2023-09-29 23:58:51,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:52,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-09-29 23:58:53,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:58:53,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:54,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 23:58:57,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 23:58:57,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:59,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:59,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=531280.0, ans=0.125 2023-09-29 23:59:02,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:59:02,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:02,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:59:04,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:04,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 23:59:08,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:15,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:59:15,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:17,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 23:59:21,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:59:21,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:59:22,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:27,052 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.94 vs. limit=10.0 2023-09-29 23:59:27,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:59:32,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:32,653 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:59:38,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 23:59:41,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 23:59:42,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=531413.3333333334, ans=0.2 2023-09-29 23:59:44,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:59:44,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:44,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:59:45,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:47,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 23:59:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:50,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:55,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:59:59,023 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 00:00:00,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=531480.0, ans=0.125 2023-09-30 00:00:02,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:00:03,436 INFO [train.py:1039] (1/4) Epoch 16, batch 50, loss[loss=0.1674, simple_loss=0.2468, pruned_loss=0.04394, over 24661.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2607, pruned_loss=0.05681, over 1073559.05 frames. ], batch size: 65, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:00:03,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:06,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:06,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 00:00:08,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:00:09,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:00:11,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:14,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:15,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:19,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 00:00:19,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:25,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:00:27,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 00:00:29,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 00:00:31,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:00:34,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:00:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:34,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:00:36,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:00:37,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:00:37,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:44,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:00:47,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:00:47,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:00:48,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 00:00:50,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:00:51,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:00:51,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 00:00:53,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:55,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 00:00:56,168 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.66 vs. limit=15.0 2023-09-30 00:01:03,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:03,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:01:04,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:08,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:08,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:13,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 00:01:13,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 00:01:13,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=531813.3333333334, ans=0.1 2023-09-30 00:01:14,153 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.37 vs. limit=15.0 2023-09-30 00:01:14,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:14,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:17,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:01:17,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:01:19,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 00:01:19,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 00:01:22,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 00:01:23,585 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.948e+02 2.203e+02 2.562e+02 3.872e+02, threshold=4.407e+02, percent-clipped=0.0 2023-09-30 00:01:23,647 INFO [train.py:1039] (1/4) Epoch 16, batch 100, loss[loss=0.1671, simple_loss=0.2415, pruned_loss=0.04632, over 23718.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2612, pruned_loss=0.05565, over 1897354.86 frames. ], batch size: 149, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:01:23,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:23,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:01:25,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 00:01:25,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 00:01:25,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:27,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:28,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:01:28,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:01:33,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:01:35,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:01:38,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:38,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 00:01:38,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:43,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:01:43,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:43,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:43,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:45,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:45,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 00:01:49,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:01:49,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:49,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:49,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:52,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 00:01:54,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:56,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:57,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:01:57,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=532013.3333333334, ans=0.025 2023-09-30 00:01:59,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:02:00,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=532013.3333333334, ans=0.125 2023-09-30 00:02:02,151 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 00:02:02,175 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 00:02:03,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:03,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:02:08,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:02:10,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:02:11,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:12,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532080.0, ans=0.1 2023-09-30 00:02:18,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:18,999 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 00:02:22,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:02:25,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:02:27,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:02:28,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:33,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:34,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:36,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:02:40,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:40,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:41,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:41,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:02:41,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:43,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 00:02:43,347 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 00:02:43,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:43,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:02:43,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:43,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:43,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 00:02:45,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:02:45,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:02:45,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:46,471 INFO [train.py:1039] (1/4) Epoch 16, batch 150, loss[loss=0.2019, simple_loss=0.2643, pruned_loss=0.06975, over 22702.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2611, pruned_loss=0.05526, over 2527034.83 frames. ], batch size: 322, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:02:47,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:48,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:48,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:02:49,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:02:51,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:54,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:54,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:02:56,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:00,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:00,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:03,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:03:04,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:07,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 00:03:07,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 00:03:07,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 00:03:10,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:03:10,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:03:10,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:03:13,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:03:13,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:13,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:13,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:13,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=532280.0, ans=0.2 2023-09-30 00:03:14,732 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 00:03:17,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:22,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:26,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:03:28,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 00:03:31,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:03:31,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:31,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:03:32,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.24 vs. limit=22.5 2023-09-30 00:03:33,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:03:36,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:37,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:03:38,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:39,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 00:03:44,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:46,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:03:46,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:03:46,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:03:49,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:52,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 00:03:55,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:03:57,567 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.13 vs. limit=6.0 2023-09-30 00:03:58,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:03:58,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=532480.0, ans=0.125 2023-09-30 00:04:00,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:03,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:04:04,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 00:04:05,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:04:05,678 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 00:04:08,525 INFO [train.py:1039] (1/4) Epoch 16, batch 200, loss[loss=0.1791, simple_loss=0.2617, pruned_loss=0.04826, over 24471.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2606, pruned_loss=0.05565, over 3021126.81 frames. ], batch size: 63, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:04:10,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.410e+02 1.995e+02 2.387e+02 2.784e+02 4.621e+02, threshold=4.773e+02, percent-clipped=1.0 2023-09-30 00:04:10,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:10,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=532546.6666666666, ans=0.1 2023-09-30 00:04:12,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:04:13,230 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-30 00:04:13,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:04:17,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 00:04:18,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:18,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:22,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 00:04:23,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:04:23,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:25,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:25,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=532613.3333333334, ans=0.1 2023-09-30 00:04:28,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:04:28,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:30,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:37,737 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.18 vs. limit=15.0 2023-09-30 00:04:49,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:04:49,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:04:50,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:04:52,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:04:52,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:04:52,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:04:55,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:57,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:04:59,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:59,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:00,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 00:05:02,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:05:02,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:06,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:05:12,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:05:18,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:19,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:05:27,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:30,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 00:05:32,398 INFO [train.py:1039] (1/4) Epoch 16, batch 250, loss[loss=0.1761, simple_loss=0.2306, pruned_loss=0.06084, over 22681.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2608, pruned_loss=0.05564, over 3391903.76 frames. ], batch size: 322, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:05:32,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:32,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:05:32,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:32,745 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:05:33,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:05:34,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 00:05:34,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:05:34,286 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 00:05:37,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:37,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=532880.0, ans=0.125 2023-09-30 00:05:38,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:05:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:41,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:42,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=532880.0, ans=0.125 2023-09-30 00:05:44,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:05:45,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:47,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:05:50,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:05:57,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=532946.6666666666, ans=0.125 2023-09-30 00:06:02,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=532946.6666666666, ans=0.125 2023-09-30 00:06:03,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:05,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:05,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:06:12,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:06:12,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:06:13,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:06:15,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:15,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:06:15,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:06:16,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:19,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:06:22,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 00:06:23,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:24,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:06:25,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:06:25,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:06:25,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:27,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:06:27,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:06:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:31,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:06:32,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:37,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=533146.6666666666, ans=0.125 2023-09-30 00:06:38,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:06:40,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:44,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:06:48,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:49,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=533146.6666666666, ans=0.0 2023-09-30 00:06:50,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:06:52,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=533146.6666666666, ans=0.0 2023-09-30 00:06:54,755 INFO [train.py:1039] (1/4) Epoch 16, batch 300, loss[loss=0.1462, simple_loss=0.2255, pruned_loss=0.03349, over 24585.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2585, pruned_loss=0.05506, over 3690684.83 frames. ], batch size: 60, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:06:54,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 00:06:55,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:55,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:56,945 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.129e+02 2.398e+02 3.317e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-30 00:06:57,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 00:06:58,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:06:58,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=533213.3333333334, ans=0.04949747468305833 2023-09-30 00:07:00,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:07:00,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 00:07:05,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:07:07,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:10,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:07:10,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 00:07:12,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:07:13,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:07:13,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 00:07:15,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:18,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:07:23,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:07:23,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 00:07:30,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 00:07:30,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:32,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=533346.6666666666, ans=0.125 2023-09-30 00:07:33,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:35,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:35,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 00:07:35,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:07:36,424 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.51 vs. limit=15.0 2023-09-30 00:07:37,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:07:39,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:07:41,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:07:47,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:07:47,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 00:07:48,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:07:52,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:53,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 00:07:54,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:59,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:02,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:08:02,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 00:08:06,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:06,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:08:08,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:11,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:08:11,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 00:08:11,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:08:13,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:15,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 00:08:17,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:18,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:20,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:20,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:20,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:22,266 INFO [train.py:1039] (1/4) Epoch 16, batch 350, loss[loss=0.1864, simple_loss=0.2541, pruned_loss=0.05935, over 23804.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2577, pruned_loss=0.05484, over 3921647.66 frames. ], batch size: 195, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:08:26,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:26,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 00:08:28,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:34,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:37,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:39,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:40,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 00:08:42,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:42,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 00:08:46,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:46,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 00:08:48,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:51,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 00:08:52,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:08:55,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:57,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:58,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:08:58,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:08:59,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=533680.0, ans=0.0 2023-09-30 00:09:00,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:00,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:01,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:09:03,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:03,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:09,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:09,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:09:10,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:09:10,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:14,708 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.10 vs. limit=22.5 2023-09-30 00:09:17,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 00:09:17,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:22,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:22,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:22,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:09:24,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 00:09:26,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:27,604 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 00:09:27,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 00:09:27,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:31,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:31,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 00:09:32,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=15.0 2023-09-30 00:09:34,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:37,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:09:38,335 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.15 vs. limit=10.0 2023-09-30 00:09:39,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:40,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:40,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:42,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:45,698 INFO [train.py:1039] (1/4) Epoch 16, batch 400, loss[loss=0.1965, simple_loss=0.2596, pruned_loss=0.0667, over 22765.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.257, pruned_loss=0.05486, over 4088909.03 frames. ], batch size: 322, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:09:45,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:47,265 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.835e+02 1.997e+02 2.324e+02 4.354e+02, threshold=3.993e+02, percent-clipped=1.0 2023-09-30 00:09:47,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:09:48,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 00:09:48,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:49,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:51,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:09:52,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:56,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:57,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:59,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 00:10:01,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 00:10:01,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:03,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 00:10:03,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:08,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:10:08,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:08,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 00:10:09,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=533946.6666666666, ans=0.125 2023-09-30 00:10:10,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:10:10,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:10,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:11,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:10:13,176 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 00:10:14,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 00:10:19,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:20,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 00:10:23,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 00:10:27,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:10:31,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:38,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 00:10:44,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:10:44,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 00:10:45,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:47,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:10:47,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 00:10:52,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:10:55,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:10:56,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:58,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:58,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 00:11:00,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:11:02,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 00:11:02,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=534146.6666666666, ans=0.125 2023-09-30 00:11:03,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:11:05,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:11:06,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 00:11:08,080 INFO [train.py:1039] (1/4) Epoch 16, batch 450, loss[loss=0.2313, simple_loss=0.2882, pruned_loss=0.08719, over 19618.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2583, pruned_loss=0.05552, over 4237381.45 frames. ], batch size: 388, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:11:09,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:11:09,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:11:09,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:11:12,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 00:11:12,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:11:14,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:11:14,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:11:14,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 00:11:14,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:11:16,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:11:17,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=534213.3333333334, ans=0.0 2023-09-30 00:11:19,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:11:29,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:29,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:11:30,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 00:11:30,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 00:11:31,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=534280.0, ans=0.2 2023-09-30 00:11:36,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:11:37,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:40,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:43,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 00:11:47,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 00:11:49,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 00:11:49,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:11:51,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:52,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:11:54,535 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 00:11:54,550 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 00:11:54,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:56,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:11:57,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:12:00,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:12:00,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:12:00,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:12:02,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 00:12:05,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:07,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:12:07,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:12:09,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 00:12:14,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:12:15,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 00:12:15,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 00:12:17,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:21,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=15.0 2023-09-30 00:12:24,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:12:26,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:27,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:12:27,697 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 00:12:30,631 INFO [train.py:1039] (1/4) Epoch 16, batch 500, loss[loss=0.176, simple_loss=0.241, pruned_loss=0.05546, over 20589.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2589, pruned_loss=0.05574, over 4343793.95 frames. ], batch size: 44, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:12:32,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:33,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:12:34,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=534546.6666666666, ans=0.125 2023-09-30 00:12:35,030 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.824e+02 2.052e+02 2.354e+02 3.367e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 00:12:35,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:35,196 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 00:12:36,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 00:12:36,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:38,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=534546.6666666666, ans=0.125 2023-09-30 00:12:39,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:12:45,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 00:12:45,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:12:48,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:48,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:49,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:03,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:03,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:13:03,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:13:03,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:04,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 00:13:04,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:13:08,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:13:08,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:13:08,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:13:08,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:09,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 00:13:12,786 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 00:13:14,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:14,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:13:20,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 00:13:24,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:13:25,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:31,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:35,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:41,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:42,935 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:13:44,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 00:13:44,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:44,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:47,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 00:13:47,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:13:49,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:51,952 INFO [train.py:1039] (1/4) Epoch 16, batch 550, loss[loss=0.265, simple_loss=0.3149, pruned_loss=0.1076, over 19325.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2605, pruned_loss=0.05615, over 4431604.93 frames. ], batch size: 388, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:13:55,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 00:13:57,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 00:13:57,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:57,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 00:13:59,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:13:59,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:59,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:14:01,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:14:04,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:14:04,758 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=534880.0, ans=0.125 2023-09-30 00:14:04,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=534880.0, ans=0.125 2023-09-30 00:14:05,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 00:14:06,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:14:11,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:11,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:14,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:15,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:20,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 00:14:20,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 00:14:23,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:14:28,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:14:30,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:31,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:14:32,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=535013.3333333334, ans=0.1 2023-09-30 00:14:34,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:34,904 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 00:14:35,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:36,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:14:39,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:39,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:14:39,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:14:40,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:42,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 00:14:43,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 00:14:45,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:14:45,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:45,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:14:45,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:14:48,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:14:51,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:14:53,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=535080.0, ans=0.0 2023-09-30 00:14:54,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:14:54,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:56,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 00:14:58,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:15:00,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:02,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:15:02,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=535146.6666666666, ans=0.125 2023-09-30 00:15:03,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:05,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:15:05,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:15:12,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 00:15:14,321 INFO [train.py:1039] (1/4) Epoch 16, batch 600, loss[loss=0.2515, simple_loss=0.3085, pruned_loss=0.09728, over 19475.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2606, pruned_loss=0.05663, over 4478125.37 frames. ], batch size: 388, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:15:16,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 00:15:16,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:15:18,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:15:18,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:18,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=535213.3333333334, ans=0.0 2023-09-30 00:15:19,620 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.903e+02 2.137e+02 2.465e+02 5.407e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 00:15:20,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=535213.3333333334, ans=0.125 2023-09-30 00:15:26,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:15:28,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:15:29,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 00:15:31,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:15:34,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:15:37,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:38,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 00:15:38,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=535280.0, ans=0.125 2023-09-30 00:15:40,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:15:43,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 00:15:49,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:15:49,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:50,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:15:55,264 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=535346.6666666666, ans=0.0 2023-09-30 00:15:56,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:15:56,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:15:57,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:01,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=535346.6666666666, ans=0.0 2023-09-30 00:16:04,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:16:09,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:09,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:16:09,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:16:16,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 00:16:21,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:16:21,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:16:26,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 00:16:26,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:16:29,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 00:16:29,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:16:29,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:16:37,777 INFO [train.py:1039] (1/4) Epoch 16, batch 650, loss[loss=0.1897, simple_loss=0.2582, pruned_loss=0.06061, over 23885.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2592, pruned_loss=0.0568, over 4519649.86 frames. ], batch size: 195, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:16:37,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:16:41,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:16:43,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:16:43,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=535546.6666666666, ans=0.0 2023-09-30 00:16:44,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:16:45,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=535546.6666666666, ans=0.0 2023-09-30 00:16:46,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:16:49,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 00:16:49,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:55,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:16:55,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:16:56,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=535613.3333333334, ans=0.125 2023-09-30 00:16:58,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.43 vs. limit=15.0 2023-09-30 00:16:58,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:02,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 00:17:04,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:05,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:09,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:09,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:17:12,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:14,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:16,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:17:16,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:18,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:17:20,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:17:20,103 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 00:17:20,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:20,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:24,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:26,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:26,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:27,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:17:29,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 00:17:29,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:17:29,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:17:31,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:17:31,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:32,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:17:34,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 00:17:34,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 00:17:35,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:35,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:36,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:17:36,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:39,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:40,355 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.21 vs. limit=15.0 2023-09-30 00:17:44,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=535813.3333333334, ans=0.2 2023-09-30 00:17:47,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:47,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:49,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:53,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:53,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:17:54,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:59,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:17:59,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:17:59,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:17:59,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:01,189 INFO [train.py:1039] (1/4) Epoch 16, batch 700, loss[loss=0.1739, simple_loss=0.24, pruned_loss=0.0539, over 23679.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.258, pruned_loss=0.05604, over 4572467.28 frames. ], batch size: 232, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:18:05,524 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.452e+02 1.867e+02 2.174e+02 2.485e+02 3.899e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-30 00:18:05,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 00:18:07,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 00:18:07,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=535880.0, ans=0.1 2023-09-30 00:18:09,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=535880.0, ans=0.0 2023-09-30 00:18:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 00:18:11,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:12,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:18:14,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 00:18:19,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:18:22,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:18:23,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=535946.6666666666, ans=0.0 2023-09-30 00:18:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:25,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:18:27,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:18:29,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:31,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:18:31,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:18:32,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 00:18:37,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 00:18:41,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:18:41,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:18:44,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:18:46,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=536013.3333333334, ans=0.125 2023-09-30 00:18:49,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:18:49,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 00:18:54,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:56,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:18:56,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 00:19:01,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:19:01,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:04,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:08,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.68 vs. limit=22.5 2023-09-30 00:19:10,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:19:10,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 00:19:11,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.95 vs. limit=22.5 2023-09-30 00:19:15,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 00:19:15,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 00:19:18,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:20,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:22,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:24,041 INFO [train.py:1039] (1/4) Epoch 16, batch 750, loss[loss=0.162, simple_loss=0.241, pruned_loss=0.04152, over 24451.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2575, pruned_loss=0.05558, over 4606650.61 frames. ], batch size: 63, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:19:24,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:24,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 00:19:28,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 00:19:28,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 00:19:28,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 00:19:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 00:19:31,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 00:19:31,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:19:32,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 00:19:32,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:34,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:19:36,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:39,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:39,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:19:39,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:41,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:19:41,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=536280.0, ans=0.1 2023-09-30 00:19:42,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:19:45,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:19:48,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:48,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:48,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 00:19:50,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:19:52,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:53,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:55,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:19:57,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 00:19:57,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:59,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 00:19:59,321 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 00:19:59,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=536346.6666666666, ans=0.0 2023-09-30 00:20:00,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 00:20:00,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:20:02,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:20:05,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:20:11,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:20:11,767 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=536346.6666666666, ans=0.125 2023-09-30 00:20:12,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:12,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:20:14,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:20:17,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:17,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 00:20:17,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:20:18,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=536413.3333333334, ans=0.1 2023-09-30 00:20:19,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 00:20:19,370 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=536413.3333333334, ans=0.125 2023-09-30 00:20:20,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:20:23,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:20:23,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 00:20:25,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:30,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:20:32,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:20:32,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:34,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:20:38,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 00:20:38,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:40,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:41,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:43,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:44,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:46,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:20:47,959 INFO [train.py:1039] (1/4) Epoch 16, batch 800, loss[loss=0.2105, simple_loss=0.273, pruned_loss=0.07398, over 22903.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2583, pruned_loss=0.05607, over 4629191.11 frames. ], batch size: 322, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:20:52,608 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.946e+02 2.133e+02 2.496e+02 4.467e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 00:20:54,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:54,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:55,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:55,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:57,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:58,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:59,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:01,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=536546.6666666666, ans=0.0 2023-09-30 00:21:04,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:05,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:21:09,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 00:21:10,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:13,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:21:13,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:21:13,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:13,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 00:21:13,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:15,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 00:21:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:22,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:25,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:21:25,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=536680.0, ans=0.0 2023-09-30 00:21:26,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:27,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.54 vs. limit=22.5 2023-09-30 00:21:28,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:28,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:32,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:21:32,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:21:34,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 00:21:37,332 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 00:21:37,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 00:21:37,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:21:37,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:21:39,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:39,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:21:45,763 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 00:21:45,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 00:21:48,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:21:51,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:21:53,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:21:58,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:59,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 00:21:59,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:22:02,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 00:22:07,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=536880.0, ans=0.125 2023-09-30 00:22:08,796 INFO [train.py:1039] (1/4) Epoch 16, batch 850, loss[loss=0.1774, simple_loss=0.2598, pruned_loss=0.04751, over 24647.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2585, pruned_loss=0.05634, over 4638878.02 frames. ], batch size: 65, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:22:10,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:12,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:22:13,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 00:22:13,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:22:15,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:17,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 00:22:17,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:18,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:22:20,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:20,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:22:24,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:22:25,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 00:22:25,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 00:22:25,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 00:22:27,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:28,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:22:30,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:30,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:30,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:22:35,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:35,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:22:35,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 00:22:38,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 00:22:42,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=537013.3333333334, ans=0.125 2023-09-30 00:22:43,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:44,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 00:22:46,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 00:22:50,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 00:22:53,541 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 00:22:53,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:22:53,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:22:53,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:22:57,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 00:23:00,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:23:02,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:03,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:23:03,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:23:05,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:23:06,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:23:07,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 00:23:12,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:23:12,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:13,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:23:13,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:14,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:15,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=537146.6666666666, ans=0.125 2023-09-30 00:23:17,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:23:19,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:23:21,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:23:22,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:22,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:23:29,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:23:31,172 INFO [train.py:1039] (1/4) Epoch 16, batch 900, loss[loss=0.2285, simple_loss=0.2866, pruned_loss=0.08519, over 19420.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2606, pruned_loss=0.05734, over 4647990.81 frames. ], batch size: 388, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:23:31,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:31,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 00:23:31,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:31,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:33,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 00:23:36,678 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.050e+02 2.390e+02 2.977e+02 4.145e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-30 00:23:38,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:23:42,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:42,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 00:23:46,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:23:46,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 00:23:48,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:23:49,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:49,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:23:49,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:23:49,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:23:53,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=537280.0, ans=0.1 2023-09-30 00:24:02,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:02,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:24:02,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:24:05,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:11,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 00:24:14,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:24:18,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:24:18,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:24:20,095 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 00:24:21,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 00:24:27,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:24:27,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:24:30,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:24:36,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:36,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:24:39,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 00:24:39,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:40,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.02 vs. limit=15.0 2023-09-30 00:24:41,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 00:24:43,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:24:43,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:45,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:24:45,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:24:50,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 00:24:50,652 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 00:24:52,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:24:52,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 00:24:54,970 INFO [train.py:1039] (1/4) Epoch 16, batch 950, loss[loss=0.1646, simple_loss=0.2508, pruned_loss=0.03916, over 24483.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2614, pruned_loss=0.0582, over 4646046.53 frames. ], batch size: 66, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:24:55,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:59,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 00:24:59,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=537546.6666666666, ans=0.2 2023-09-30 00:25:04,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:08,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:08,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:09,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:25:12,914 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 00:25:16,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:16,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:17,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:17,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:25:17,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 00:25:19,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:25:21,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:21,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 00:25:21,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:27,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:25:29,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 00:25:30,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:25:32,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:32,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=537680.0, ans=0.125 2023-09-30 00:25:32,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=537680.0, ans=0.125 2023-09-30 00:25:34,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=537680.0, ans=0.125 2023-09-30 00:25:35,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:25:40,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:25:40,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:45,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 00:25:45,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=537746.6666666666, ans=0.125 2023-09-30 00:25:46,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:25:46,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:25:48,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:48,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:48,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:25:48,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=537746.6666666666, ans=0.0 2023-09-30 00:25:53,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 00:25:53,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=537746.6666666666, ans=0.125 2023-09-30 00:25:54,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:25:58,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:59,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:59,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 00:25:59,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:59,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:25:59,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 00:26:04,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:26:07,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:26:07,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=537813.3333333334, ans=0.125 2023-09-30 00:26:10,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:13,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 00:26:13,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 00:26:13,851 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.50 vs. limit=15.0 2023-09-30 00:26:16,017 INFO [train.py:1039] (1/4) Epoch 16, batch 1000, loss[loss=0.1762, simple_loss=0.262, pruned_loss=0.04519, over 24658.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2608, pruned_loss=0.05746, over 4662364.24 frames. ], batch size: 65, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:26:16,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:26:20,689 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 2.021e+02 2.226e+02 2.513e+02 3.322e+02, threshold=4.453e+02, percent-clipped=0.0 2023-09-30 00:26:20,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 00:26:20,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:24,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:26:26,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 00:26:26,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 00:26:28,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=537880.0, ans=0.1 2023-09-30 00:26:31,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:31,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:33,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 00:26:39,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=537946.6666666666, ans=0.125 2023-09-30 00:26:40,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 00:26:42,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 00:26:43,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:26:46,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 00:26:46,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 00:26:46,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 00:26:49,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:50,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:57,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:59,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:27:01,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:01,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:01,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 00:27:01,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:02,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:27:03,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:27:04,452 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 00:27:05,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.52 vs. limit=22.5 2023-09-30 00:27:06,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 00:27:07,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 00:27:09,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 00:27:10,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:27:17,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.51 vs. limit=15.0 2023-09-30 00:27:18,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:18,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:27:19,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:20,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:27:21,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 00:27:23,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:27:23,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 00:27:24,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 00:27:26,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:27:26,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:27,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:27:31,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:27:33,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:36,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:27:37,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:27:39,056 INFO [train.py:1039] (1/4) Epoch 16, batch 1050, loss[loss=0.162, simple_loss=0.2063, pruned_loss=0.05887, over 19219.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2591, pruned_loss=0.05681, over 4675144.50 frames. ], batch size: 388, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:27:39,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:27:40,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:43,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:27:45,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=538213.3333333334, ans=0.125 2023-09-30 00:27:47,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:27:48,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:27:52,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:27:53,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:27:54,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:27:55,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:27:55,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 00:27:57,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:27:57,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=538280.0, ans=0.0 2023-09-30 00:27:58,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 00:28:00,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:28:01,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 00:28:01,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:28:05,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=538280.0, ans=0.125 2023-09-30 00:28:07,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:28:09,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:28:09,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:28:09,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=538280.0, ans=0.125 2023-09-30 00:28:12,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 00:28:12,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 00:28:12,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:28:15,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 00:28:19,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 00:28:19,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:23,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:28:27,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:28:27,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:28:28,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:28:29,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:31,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:28:36,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 00:28:37,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 00:28:38,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 00:28:38,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:39,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:28:40,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 00:28:40,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=538413.3333333334, ans=0.1 2023-09-30 00:28:43,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=538480.0, ans=0.2 2023-09-30 00:28:44,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:28:47,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:47,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:28:49,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:49,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:49,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=538480.0, ans=0.1 2023-09-30 00:28:54,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:54,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 00:28:55,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:55,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 00:28:55,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=538480.0, ans=0.0 2023-09-30 00:28:57,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 00:28:57,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:28:57,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=538480.0, ans=0.2 2023-09-30 00:29:00,862 INFO [train.py:1039] (1/4) Epoch 16, batch 1100, loss[loss=0.1592, simple_loss=0.2393, pruned_loss=0.03954, over 24428.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2589, pruned_loss=0.05594, over 4698710.76 frames. ], batch size: 58, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:29:01,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:03,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=538546.6666666666, ans=0.1 2023-09-30 00:29:06,282 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.914e+02 2.113e+02 2.523e+02 4.579e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-30 00:29:07,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:29:09,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=538546.6666666666, ans=0.0 2023-09-30 00:29:14,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:29:15,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:29:15,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:17,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 00:29:19,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:29:21,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:29:21,802 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.71 vs. limit=15.0 2023-09-30 00:29:24,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:29:25,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:29:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 00:29:27,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:29:27,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:29,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:29:30,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:29:34,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:29:39,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:29:39,931 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=15.0 2023-09-30 00:29:42,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=538680.0, ans=0.125 2023-09-30 00:29:44,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 00:29:45,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 00:29:45,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:47,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:49,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:29:49,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:51,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 00:29:52,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:29:52,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:29:52,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:29:52,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:54,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 00:29:59,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:29:59,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 00:30:00,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-09-30 00:30:02,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:30:08,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:30:12,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 00:30:12,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:30:14,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:15,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:16,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:18,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 00:30:19,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:30:19,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:21,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 00:30:21,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:30:22,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 00:30:22,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=538880.0, ans=0.125 2023-09-30 00:30:24,579 INFO [train.py:1039] (1/4) Epoch 16, batch 1150, loss[loss=0.1822, simple_loss=0.2613, pruned_loss=0.05152, over 23941.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2593, pruned_loss=0.0557, over 4702534.44 frames. ], batch size: 80, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:30:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:30:24,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:30:25,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=538880.0, ans=0.0 2023-09-30 00:30:26,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:30:31,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:34,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:30:36,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:36,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:30:36,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 00:30:36,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:30:39,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 00:30:39,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:39,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:30:46,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 00:30:48,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:53,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:54,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:30:54,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 00:30:56,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:30:56,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:31:00,654 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.75 vs. limit=12.0 2023-09-30 00:31:01,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 00:31:01,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:02,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:31:08,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=539013.3333333334, ans=0.125 2023-09-30 00:31:13,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:18,379 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.66 vs. limit=12.0 2023-09-30 00:31:19,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:19,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 00:31:21,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:21,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:27,778 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 00:31:28,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=539080.0, ans=0.1 2023-09-30 00:31:29,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:37,331 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 00:31:41,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:42,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:31:42,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:31:44,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:31:47,051 INFO [train.py:1039] (1/4) Epoch 16, batch 1200, loss[loss=0.2079, simple_loss=0.2719, pruned_loss=0.07194, over 23803.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2598, pruned_loss=0.05631, over 4702928.87 frames. ], batch size: 179, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:31:48,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:31:53,129 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.828e+02 2.089e+02 2.357e+02 3.548e+02, threshold=4.177e+02, percent-clipped=0.0 2023-09-30 00:31:55,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:31:55,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:31:57,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:57,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:58,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:32:00,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:32:02,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:32:03,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=539280.0, ans=0.125 2023-09-30 00:32:03,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:03,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:07,031 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 00:32:07,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=539280.0, ans=0.0 2023-09-30 00:32:10,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 00:32:13,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:32:14,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.04 vs. limit=10.0 2023-09-30 00:32:16,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:32:19,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:22,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:32:22,099 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 00:32:22,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:28,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:32:29,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:32:29,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 00:32:30,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:32:34,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 00:32:38,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 00:32:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:41,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:41,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=539413.3333333334, ans=0.1 2023-09-30 00:32:44,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:44,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:32:46,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:46,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:32:47,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:32:48,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 00:32:48,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:32:48,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:32:48,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:32:50,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:50,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:53,103 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.06 vs. limit=22.5 2023-09-30 00:32:55,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:32:57,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:32:57,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=539480.0, ans=0.125 2023-09-30 00:33:01,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 00:33:05,022 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 00:33:09,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:10,642 INFO [train.py:1039] (1/4) Epoch 16, batch 1250, loss[loss=0.1673, simple_loss=0.2429, pruned_loss=0.04585, over 24324.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2611, pruned_loss=0.05676, over 4695665.79 frames. ], batch size: 61, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:33:12,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:33:13,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:33:15,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:33:17,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 00:33:20,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=539546.6666666666, ans=0.1 2023-09-30 00:33:22,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:33:22,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:22,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 00:33:23,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:33:25,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:33:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:33:30,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:32,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:33:32,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:33,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=539613.3333333334, ans=0.2 2023-09-30 00:33:35,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:33:35,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=539613.3333333334, ans=0.02 2023-09-30 00:33:38,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 00:33:38,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:33:38,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:41,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:43,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:46,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:33:48,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:33:52,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 00:33:53,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:33:55,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:33:56,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 00:33:58,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:58,260 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 00:33:58,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:58,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:01,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:06,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:06,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:34:08,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 00:34:08,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 00:34:08,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 00:34:11,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:13,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 00:34:13,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:15,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:34:15,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:34:18,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 00:34:18,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:34:18,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:34:19,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:34:20,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:34:23,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 00:34:27,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:28,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:34:29,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:34:31,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:34:33,154 INFO [train.py:1039] (1/4) Epoch 16, batch 1300, loss[loss=0.187, simple_loss=0.2569, pruned_loss=0.05852, over 23632.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2613, pruned_loss=0.05745, over 4689691.39 frames. ], batch size: 135, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:34:36,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:36,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 00:34:39,913 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.902e+02 2.089e+02 2.370e+02 3.462e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 00:34:41,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:43,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:34:43,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:34:46,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:48,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:34:48,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 00:34:52,512 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:34:53,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:34:53,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:34:55,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 00:34:55,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=539946.6666666666, ans=0.2 2023-09-30 00:35:00,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:35:03,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:05,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:06,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:35:07,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=540013.3333333334, ans=0.125 2023-09-30 00:35:08,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:08,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:35:09,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:35:09,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 00:35:16,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:35:17,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:35:19,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 00:35:20,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:35:21,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:35:25,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:35:25,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=540080.0, ans=10.0 2023-09-30 00:35:26,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 00:35:26,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:26,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 00:35:28,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:31,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=540080.0, ans=0.0 2023-09-30 00:35:33,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:33,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:35:36,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 00:35:38,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 00:35:39,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 00:35:42,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:35:45,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 00:35:47,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:54,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 00:35:56,304 INFO [train.py:1039] (1/4) Epoch 16, batch 1350, loss[loss=0.1856, simple_loss=0.2673, pruned_loss=0.05195, over 24048.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2611, pruned_loss=0.05746, over 4693422.15 frames. ], batch size: 80, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:35:59,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:02,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:06,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:36:07,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:09,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:36:09,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:12,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:13,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 00:36:15,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:16,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:36:18,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 00:36:19,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:36:20,301 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=540280.0, ans=0.2 2023-09-30 00:36:21,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:36:21,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 00:36:24,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.76 vs. limit=5.0 2023-09-30 00:36:25,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 00:36:28,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 00:36:29,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:29,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 00:36:35,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=540346.6666666666, ans=0.0 2023-09-30 00:36:38,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=540346.6666666666, ans=0.125 2023-09-30 00:36:42,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:52,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:52,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:52,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 00:36:55,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:56,010 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=12.0 2023-09-30 00:36:58,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 00:36:58,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:58,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:37:02,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:37:04,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 00:37:07,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:37:07,980 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-09-30 00:37:11,936 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-09-30 00:37:13,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 00:37:14,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 00:37:17,861 INFO [train.py:1039] (1/4) Epoch 16, batch 1400, loss[loss=0.1818, simple_loss=0.2558, pruned_loss=0.05394, over 24311.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.26, pruned_loss=0.05712, over 4698297.33 frames. ], batch size: 61, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:37:19,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 00:37:19,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=540546.6666666666, ans=0.1 2023-09-30 00:37:22,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:37:23,989 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.842e+02 1.998e+02 2.370e+02 3.291e+02, threshold=3.996e+02, percent-clipped=0.0 2023-09-30 00:37:24,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:37:24,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:37:31,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 00:37:33,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 00:37:44,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:37:45,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:37:48,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=540613.3333333334, ans=0.125 2023-09-30 00:37:49,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:37:49,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:37:54,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:37:54,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=540680.0, ans=0.125 2023-09-30 00:37:55,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:38:03,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:04,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:09,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 00:38:10,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:38:12,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:38:12,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:38:13,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:14,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=540746.6666666666, ans=0.125 2023-09-30 00:38:14,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=540746.6666666666, ans=0.125 2023-09-30 00:38:15,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:38:15,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:38:15,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:38:16,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 00:38:16,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:38:22,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:25,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:38:33,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 00:38:33,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:38:34,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:38:36,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:38:38,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:39,963 INFO [train.py:1039] (1/4) Epoch 16, batch 1450, loss[loss=0.1927, simple_loss=0.2554, pruned_loss=0.06507, over 23585.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2587, pruned_loss=0.05626, over 4700338.58 frames. ], batch size: 256, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:38:40,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:38:43,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:38:45,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:38:45,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:45,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:38:50,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:50,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=540880.0, ans=0.1 2023-09-30 00:38:51,510 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.84 vs. limit=15.0 2023-09-30 00:38:51,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:38:53,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:54,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 00:38:55,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:38:57,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 00:38:57,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:58,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:38:58,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 00:39:00,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:00,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:39:01,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 00:39:01,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:03,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:39:04,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:07,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:08,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=540946.6666666666, ans=0.125 2023-09-30 00:39:11,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:39:11,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:39:13,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:39:13,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:14,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:14,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:39:16,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:16,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:21,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 00:39:24,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:28,516 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 00:39:28,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=541080.0, ans=0.0 2023-09-30 00:39:30,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:30,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:39:31,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:33,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 00:39:37,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:39,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 00:39:40,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 00:39:42,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:43,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:39:45,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:48,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 00:39:51,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 00:39:51,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 00:39:51,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=541146.6666666666, ans=0.0 2023-09-30 00:39:52,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:55,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:40:02,493 INFO [train.py:1039] (1/4) Epoch 16, batch 1500, loss[loss=0.1884, simple_loss=0.2588, pruned_loss=0.05904, over 23207.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2585, pruned_loss=0.05593, over 4703904.35 frames. ], batch size: 119, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:40:06,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 00:40:07,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:40:07,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:40:09,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:10,587 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.398e+02 1.908e+02 2.053e+02 2.386e+02 4.299e+02, threshold=4.105e+02, percent-clipped=2.0 2023-09-30 00:40:10,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:10,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:40:12,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 00:40:13,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:40:13,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:40:13,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:15,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:40:18,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:40:18,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:23,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:23,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 00:40:25,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:40:25,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:40:26,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:29,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 00:40:35,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 00:40:36,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=541346.6666666666, ans=0.2 2023-09-30 00:40:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:39,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 00:40:40,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:40:43,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:40:45,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:45,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:40:47,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 00:40:47,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:40:47,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:48,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 00:40:48,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:54,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:40:54,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 00:40:57,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=541413.3333333334, ans=0.0 2023-09-30 00:41:01,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:41:03,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:41:08,060 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 00:41:08,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:08,157 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 00:41:10,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:11,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:13,797 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 00:41:15,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:41:18,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 00:41:19,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:22,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:22,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:23,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:24,349 INFO [train.py:1039] (1/4) Epoch 16, batch 1550, loss[loss=0.1797, simple_loss=0.267, pruned_loss=0.04619, over 24435.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2591, pruned_loss=0.05538, over 4721411.43 frames. ], batch size: 69, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:41:24,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:24,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:41:26,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 00:41:26,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 00:41:26,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:41:27,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 00:41:27,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 00:41:30,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:32,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:33,716 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.15 vs. limit=6.0 2023-09-30 00:41:34,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:41:34,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:41:35,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:35,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:39,607 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 00:41:39,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:39,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=541613.3333333334, ans=0.125 2023-09-30 00:41:41,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:41:41,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:41:44,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:41:44,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 00:41:46,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:46,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 00:41:47,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 00:41:47,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 00:41:49,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:41:54,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:55,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 00:41:55,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 00:41:57,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=541680.0, ans=0.1 2023-09-30 00:42:06,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:10,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:42:10,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:42:10,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:42:12,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 00:42:17,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:42:18,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:22,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:42:25,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:42:25,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:25,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 00:42:27,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:28,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:42:28,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:30,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:42:30,359 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 00:42:33,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:36,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.26 vs. limit=15.0 2023-09-30 00:42:37,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 00:42:44,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:45,943 INFO [train.py:1039] (1/4) Epoch 16, batch 1600, loss[loss=0.1952, simple_loss=0.2594, pruned_loss=0.06551, over 23820.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.26, pruned_loss=0.05557, over 4718191.77 frames. ], batch size: 195, lr: 6.50e-03, grad_scale: 16.0 2023-09-30 00:42:46,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:46,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 00:42:46,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:47,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=541880.0, ans=0.125 2023-09-30 00:42:48,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:48,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:42:48,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:42:49,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:42:53,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:54,298 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.801e+02 1.974e+02 2.195e+02 3.172e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 00:42:54,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 00:42:56,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 00:42:59,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 00:43:02,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:04,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 00:43:04,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:43:07,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:43:08,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.51 vs. limit=22.5 2023-09-30 00:43:11,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:43:13,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 00:43:16,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:43:17,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 00:43:17,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:19,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 00:43:25,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 00:43:33,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:33,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 00:43:34,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:34,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:34,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:43:38,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 00:43:41,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 00:43:42,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:43:44,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:43:47,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:43:48,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:43:50,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:43:57,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:57,877 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=542146.6666666666, ans=0.125 2023-09-30 00:43:59,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:44:02,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 00:44:02,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:44:03,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 00:44:06,898 INFO [train.py:1039] (1/4) Epoch 16, batch 1650, loss[loss=0.1818, simple_loss=0.2479, pruned_loss=0.05788, over 23670.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2606, pruned_loss=0.05607, over 4713029.08 frames. ], batch size: 232, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:44:10,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:11,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:11,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:44:11,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 00:44:11,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 00:44:11,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 00:44:11,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 00:44:14,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:44:15,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:16,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:44:16,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:44:19,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:21,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 00:44:22,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:44:24,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:24,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:44:24,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:44:27,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 00:44:27,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 00:44:29,458 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=542280.0, ans=0.125 2023-09-30 00:44:32,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:44:35,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:44:42,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 00:44:42,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:45,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 00:44:47,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:44:49,371 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.75 vs. limit=15.0 2023-09-30 00:44:50,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:44:51,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:44:51,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:44:52,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:52,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:56,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:57,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:58,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:44:58,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:00,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:01,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:45:04,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=542413.3333333334, ans=0.2 2023-09-30 00:45:05,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:45:07,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 00:45:07,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:08,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 00:45:11,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 00:45:11,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 00:45:11,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:12,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:45:12,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:12,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:45:12,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 00:45:17,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:18,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:45:18,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:21,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 00:45:26,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:26,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:45:26,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 00:45:28,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:28,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:45:28,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:30,570 INFO [train.py:1039] (1/4) Epoch 16, batch 1700, loss[loss=0.2005, simple_loss=0.2689, pruned_loss=0.06609, over 23502.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2593, pruned_loss=0.05663, over 4690884.83 frames. ], batch size: 120, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:45:32,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:45:33,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:45:33,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 00:45:37,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:45:40,401 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.927e+02 2.213e+02 2.603e+02 4.204e+02, threshold=4.426e+02, percent-clipped=1.0 2023-09-30 00:45:45,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:48,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:45:53,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:45:53,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:45:54,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:55,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:45:58,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 00:46:01,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:46:01,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:03,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:46:05,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:46:05,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=542680.0, ans=0.2 2023-09-30 00:46:06,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 00:46:08,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 00:46:10,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:12,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 00:46:13,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:46:20,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:22,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:22,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:46:23,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:46:23,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 00:46:24,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:46:25,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=542746.6666666666, ans=0.0 2023-09-30 00:46:27,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:27,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 00:46:28,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:46:28,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:28,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:28,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:32,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:32,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:46:33,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:35,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:46:35,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:40,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:42,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 00:46:45,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:45,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:47,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 00:46:47,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=542813.3333333334, ans=0.125 2023-09-30 00:46:53,070 INFO [train.py:1039] (1/4) Epoch 16, batch 1750, loss[loss=0.193, simple_loss=0.2504, pruned_loss=0.0678, over 23676.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2579, pruned_loss=0.05635, over 4693859.38 frames. ], batch size: 232, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:46:53,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:57,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.73 vs. limit=15.0 2023-09-30 00:46:58,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:58,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:46:58,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 00:46:58,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:47:01,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:47:01,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:06,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 00:47:08,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:08,702 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=542946.6666666666, ans=0.0 2023-09-30 00:47:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 00:47:10,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:11,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:47:15,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:47:16,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 00:47:18,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:47:18,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 00:47:28,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:47:32,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:47:32,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:35,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:35,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:38,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:47:38,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:43,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:43,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:44,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 00:47:47,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:50,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 00:47:50,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:47:51,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:53,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:47:57,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:47:57,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:47:57,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:48:00,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:48:03,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:06,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:08,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:48:08,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 00:48:08,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:10,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:48:10,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:10,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:48:10,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:48:11,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:48:15,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:48:16,441 INFO [train.py:1039] (1/4) Epoch 16, batch 1800, loss[loss=0.1955, simple_loss=0.2783, pruned_loss=0.05637, over 24581.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2566, pruned_loss=0.05569, over 4698882.53 frames. ], batch size: 71, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:48:17,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:48:19,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:48:21,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:25,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=543213.3333333334, ans=0.0 2023-09-30 00:48:26,087 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.882e+02 2.134e+02 2.523e+02 4.257e+02, threshold=4.267e+02, percent-clipped=0.0 2023-09-30 00:48:26,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 00:48:26,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:48:28,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=543213.3333333334, ans=0.125 2023-09-30 00:48:31,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:34,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:34,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:36,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:48:39,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:39,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 00:48:41,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:44,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:48,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 00:48:50,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 00:48:50,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 00:48:50,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:52,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:52,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:52,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:49:00,876 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 00:49:03,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:49:05,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:08,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 00:49:08,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 00:49:08,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:49:09,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:49:11,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:49:11,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=543413.3333333334, ans=0.0 2023-09-30 00:49:11,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=543413.3333333334, ans=0.1 2023-09-30 00:49:14,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=543413.3333333334, ans=0.0 2023-09-30 00:49:16,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 00:49:23,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:49:24,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 00:49:24,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:49:24,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:24,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:49:25,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=543480.0, ans=0.025 2023-09-30 00:49:26,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 00:49:29,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:49:29,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:49:34,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 00:49:34,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:37,848 INFO [train.py:1039] (1/4) Epoch 16, batch 1850, loss[loss=0.1691, simple_loss=0.2472, pruned_loss=0.04547, over 24590.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2572, pruned_loss=0.05545, over 4701589.81 frames. ], batch size: 60, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:49:37,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:37,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:49:37,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:49:42,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:49:42,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:46,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:49:48,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:49:55,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:49:55,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 00:49:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 00:50:02,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 00:50:06,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:06,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 00:50:06,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:50:12,712 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.87 vs. limit=22.5 2023-09-30 00:50:18,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:50:20,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 00:50:21,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=543680.0, ans=0.0 2023-09-30 00:50:23,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:50:24,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:28,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 00:50:29,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:29,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:50:29,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:50:30,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:50:33,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:50:34,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.33 vs. limit=12.0 2023-09-30 00:50:37,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:50:37,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:38,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:50:38,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:40,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:42,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:50:45,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 00:50:45,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:50,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:50:50,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:50:50,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 00:50:50,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 00:50:52,895 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 00:50:54,884 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 00:50:56,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:50:56,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:56,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:50:58,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:59,448 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 00:50:59,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:50:59,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:00,959 INFO [train.py:1039] (1/4) Epoch 16, batch 1900, loss[loss=0.1895, simple_loss=0.2717, pruned_loss=0.05368, over 24473.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.258, pruned_loss=0.05538, over 4706475.90 frames. ], batch size: 69, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:51:01,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:51:02,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:51:02,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:51:02,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 00:51:05,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:05,846 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 00:51:05,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:51:07,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:10,289 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.936e+02 2.154e+02 2.566e+02 3.893e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-30 00:51:12,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:15,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:51:16,981 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 00:51:17,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 00:51:19,190 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.88 vs. limit=22.5 2023-09-30 00:51:19,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:51:20,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:51:20,059 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 00:51:22,101 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 00:51:25,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 00:51:27,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:51:31,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 00:51:34,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 00:51:45,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 00:51:46,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 00:51:46,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:48,344 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 00:51:48,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 00:51:48,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 00:51:49,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 00:51:49,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:51:54,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 00:51:58,051 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.98 vs. limit=15.0 2023-09-30 00:51:58,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:52:00,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:00,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 00:52:02,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:52:05,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 00:52:05,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:12,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:52:12,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:52:12,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:52:12,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:52:13,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:52:13,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 00:52:15,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:52:18,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:18,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:21,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:52:21,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:21,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:23,317 INFO [train.py:1039] (1/4) Epoch 16, batch 1950, loss[loss=0.1604, simple_loss=0.2352, pruned_loss=0.04281, over 24305.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2592, pruned_loss=0.05601, over 4706968.64 frames. ], batch size: 61, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:52:23,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:26,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:30,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:52:30,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:30,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:52:31,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 00:52:33,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:52:33,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:35,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:37,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:52:37,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:37,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:40,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:52:45,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:45,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:52:45,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:52:45,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:49,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:53,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:53,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:53,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=544280.0, ans=0.0 2023-09-30 00:52:54,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:52:54,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 00:52:54,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:52:55,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:52:56,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:59,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:02,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:53:09,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:53:14,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:53:15,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:53:15,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 00:53:16,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:19,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:53:21,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:53:22,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:29,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:30,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:31,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=544480.0, ans=0.0 2023-09-30 00:53:32,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:34,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:37,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:53:37,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:39,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 00:53:39,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:53:39,949 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=544480.0, ans=0.125 2023-09-30 00:53:41,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:41,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 00:53:44,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:53:45,411 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.84 vs. limit=10.0 2023-09-30 00:53:46,226 INFO [train.py:1039] (1/4) Epoch 16, batch 2000, loss[loss=0.1641, simple_loss=0.2379, pruned_loss=0.04514, over 24459.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2598, pruned_loss=0.05601, over 4721046.71 frames. ], batch size: 58, lr: 6.49e-03, grad_scale: 16.0 2023-09-30 00:53:47,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:49,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:53:49,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:51,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:53:54,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:55,957 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.874e+02 2.052e+02 2.476e+02 4.888e+02, threshold=4.104e+02, percent-clipped=2.0 2023-09-30 00:53:57,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 00:53:57,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:53:59,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=544546.6666666666, ans=0.1 2023-09-30 00:54:00,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:54:03,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 00:54:04,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:54:05,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:54:08,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:54:10,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 00:54:12,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:16,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 00:54:16,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:54:17,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 00:54:17,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:20,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:54:22,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:54:22,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:22,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:24,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:24,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 00:54:26,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 00:54:26,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:26,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:30,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=544680.0, ans=0.125 2023-09-30 00:54:33,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:35,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:54:35,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:36,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:54:37,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=544746.6666666666, ans=0.07 2023-09-30 00:54:39,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:41,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:41,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:41,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:42,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:46,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:46,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 00:54:52,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:54:52,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:55:02,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:02,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:02,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:02,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=544813.3333333334, ans=0.125 2023-09-30 00:55:03,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:55:03,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:55:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:07,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:08,444 INFO [train.py:1039] (1/4) Epoch 16, batch 2050, loss[loss=0.1966, simple_loss=0.2568, pruned_loss=0.0682, over 23722.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2594, pruned_loss=0.05617, over 4716763.02 frames. ], batch size: 179, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:55:10,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:11,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:18,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:55:21,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:55:21,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=544880.0, ans=0.0 2023-09-30 00:55:23,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:23,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:55:23,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=544946.6666666666, ans=0.125 2023-09-30 00:55:24,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 00:55:24,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:55:26,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:55:26,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:55:31,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=544946.6666666666, ans=0.0 2023-09-30 00:55:31,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=544946.6666666666, ans=0.0 2023-09-30 00:55:38,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:38,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:39,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 00:55:42,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:44,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 00:55:44,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:44,833 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.98 vs. limit=15.0 2023-09-30 00:55:47,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:49,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:55:51,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:55:52,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:54,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:55:54,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:55:54,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:56:00,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:01,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:56:03,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:56:04,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:08,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:13,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:56:14,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 00:56:19,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:21,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:56:23,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:56:26,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 00:56:26,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=545146.6666666666, ans=0.125 2023-09-30 00:56:28,096 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=545146.6666666666, ans=0.125 2023-09-30 00:56:28,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=545146.6666666666, ans=0.0 2023-09-30 00:56:31,377 INFO [train.py:1039] (1/4) Epoch 16, batch 2100, loss[loss=0.1705, simple_loss=0.2487, pruned_loss=0.04613, over 24323.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2588, pruned_loss=0.05567, over 4720921.64 frames. ], batch size: 61, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:56:31,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 00:56:31,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:31,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:33,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:34,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:34,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 00:56:34,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 00:56:37,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:37,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=545213.3333333334, ans=0.125 2023-09-30 00:56:41,270 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.889e+02 2.067e+02 2.438e+02 3.667e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 00:56:41,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:56:41,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:56:41,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=545213.3333333334, ans=0.125 2023-09-30 00:56:41,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=545213.3333333334, ans=0.0 2023-09-30 00:56:44,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:46,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:56:46,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 00:56:47,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:56:47,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 00:56:47,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 00:56:49,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:56:50,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:56:50,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 00:56:50,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:56:55,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 00:56:55,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:58,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:58,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:57:04,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:57:04,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 00:57:06,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:06,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:57:08,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 00:57:09,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:09,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 00:57:11,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 00:57:11,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 00:57:13,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=545346.6666666666, ans=0.1 2023-09-30 00:57:14,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:57:16,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:57:19,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:20,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:22,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:23,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:23,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 00:57:23,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:23,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:25,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:25,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 00:57:26,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 00:57:28,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 00:57:30,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:57:34,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:57:34,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 00:57:34,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=545480.0, ans=0.1 2023-09-30 00:57:40,058 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=545480.0, ans=0.5 2023-09-30 00:57:41,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:44,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:57:46,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:57:46,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:57:46,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:57:46,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:57:48,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:48,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:57:49,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:57:50,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:51,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 00:57:53,094 INFO [train.py:1039] (1/4) Epoch 16, batch 2150, loss[loss=0.2072, simple_loss=0.2774, pruned_loss=0.06851, over 23245.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2589, pruned_loss=0.05532, over 4728713.93 frames. ], batch size: 93, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:57:53,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 00:57:53,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:57:57,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:57,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:57:57,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:57:58,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:58:05,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:58:06,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:08,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:09,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:58:09,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:11,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:58:14,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:14,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:58:14,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:58:18,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:18,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 00:58:23,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:25,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:58:26,029 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-09-30 00:58:28,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:28,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:28,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:28,768 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.64 vs. limit=15.0 2023-09-30 00:58:29,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:58:29,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:29,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:58:29,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:58:31,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 00:58:32,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:58:32,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:34,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:35,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:58:37,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:58:38,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:39,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:58:40,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:40,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 00:58:40,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:58:44,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:45,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:46,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:47,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:58:49,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:50,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.54 vs. limit=22.5 2023-09-30 00:58:51,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:51,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 00:58:52,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 00:58:52,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:58:54,090 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 00:58:54,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:54,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:58:55,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 00:58:55,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:58:55,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 00:58:55,798 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 00:58:55,799 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 00:58:56,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=545813.3333333334, ans=0.05 2023-09-30 00:58:57,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 00:58:59,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:00,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:59:01,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:59:02,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:04,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:59:04,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:04,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=545813.3333333334, ans=0.125 2023-09-30 00:59:05,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:13,288 INFO [train.py:1039] (1/4) Epoch 16, batch 2200, loss[loss=0.1871, simple_loss=0.2579, pruned_loss=0.05816, over 23500.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2584, pruned_loss=0.05524, over 4730397.43 frames. ], batch size: 134, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:59:13,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:59:13,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 00:59:17,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:59:19,823 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.693e-02 2023-09-30 00:59:24,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:24,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:59:24,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:59:25,930 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.877e+02 2.145e+02 2.548e+02 4.503e+02, threshold=4.290e+02, percent-clipped=1.0 2023-09-30 00:59:26,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:59:27,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:29,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:59:29,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 00:59:33,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 00:59:36,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:59:42,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 00:59:44,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:45,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:59:45,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:59:49,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:59:50,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 00:59:52,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=546013.3333333334, ans=0.0 2023-09-30 00:59:53,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:59:55,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:56,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:59:59,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:00:00,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:03,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:00:04,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:07,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 01:00:09,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:09,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 01:00:10,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:12,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:00:12,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:13,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:00:13,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:13,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:00:16,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:00:18,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:00:22,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 01:00:23,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:00:25,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:00:28,098 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 01:00:29,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:00:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 01:00:31,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=546146.6666666666, ans=0.125 2023-09-30 01:00:32,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:00:32,881 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 01:00:34,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:34,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:00:37,296 INFO [train.py:1039] (1/4) Epoch 16, batch 2250, loss[loss=0.1786, simple_loss=0.262, pruned_loss=0.04764, over 24327.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2591, pruned_loss=0.05538, over 4728145.07 frames. ], batch size: 74, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 01:00:37,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:39,524 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 01:00:41,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:00:41,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=546213.3333333334, ans=0.0 2023-09-30 01:00:42,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:43,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=546213.3333333334, ans=0.2 2023-09-30 01:00:48,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:00:51,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:00:53,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:00:55,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:00:55,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:58,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 01:00:58,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:58,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:01:02,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 01:01:03,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:01:03,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:04,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:01:04,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=546280.0, ans=0.0 2023-09-30 01:01:05,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=546280.0, ans=0.125 2023-09-30 01:01:08,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:09,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=546346.6666666666, ans=0.2 2023-09-30 01:01:10,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:01:10,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:01:12,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 01:01:14,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:15,295 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.25 vs. limit=22.5 2023-09-30 01:01:15,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:01:20,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:21,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:24,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:01:24,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:01:26,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:27,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:01:32,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:01:34,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:01:40,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:01:40,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:01:40,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:01:45,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=546480.0, ans=0.125 2023-09-30 01:01:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:01:52,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:01:52,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 01:01:52,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:52,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:01:54,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=546480.0, ans=0.1 2023-09-30 01:01:54,601 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.42 vs. limit=6.0 2023-09-30 01:01:55,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 01:01:57,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:01:57,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:59,841 INFO [train.py:1039] (1/4) Epoch 16, batch 2300, loss[loss=0.2566, simple_loss=0.3087, pruned_loss=0.1022, over 19507.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2598, pruned_loss=0.05591, over 4716318.77 frames. ], batch size: 388, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:02:06,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:06,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:02:10,609 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 01:02:11,887 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.854e+02 2.032e+02 2.223e+02 2.869e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 01:02:12,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:16,005 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.55 vs. limit=15.0 2023-09-30 01:02:19,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:02:19,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:02:19,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:20,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:20,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 01:02:20,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=546613.3333333334, ans=0.0 2023-09-30 01:02:20,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=546613.3333333334, ans=0.125 2023-09-30 01:02:21,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:02:23,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:24,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:02:26,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=546613.3333333334, ans=0.125 2023-09-30 01:02:29,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:02:31,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:02:34,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:39,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:02:41,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:43,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:02:46,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:50,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:51,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:02:52,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:02:52,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 01:02:57,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:02:57,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:57,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:57,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:02:57,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:02:59,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:02:59,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:02:59,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 01:03:01,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:03:01,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:01,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 01:03:05,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.34 vs. limit=22.5 2023-09-30 01:03:06,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:03:09,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:03:13,859 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=22.5 2023-09-30 01:03:14,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:03:14,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:03:16,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:03:18,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:03:18,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:03:19,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:03:21,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 01:03:22,683 INFO [train.py:1039] (1/4) Epoch 16, batch 2350, loss[loss=0.1586, simple_loss=0.2401, pruned_loss=0.03852, over 24600.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2599, pruned_loss=0.05595, over 4715338.39 frames. ], batch size: 60, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:03:26,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:03:26,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 01:03:30,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 01:03:33,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:36,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=546880.0, ans=0.1 2023-09-30 01:03:37,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:39,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:03:40,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 01:03:45,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:03:51,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 01:03:53,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=546946.6666666666, ans=0.125 2023-09-30 01:03:54,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:59,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:03:59,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:04:00,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:04:01,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 01:04:02,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:04:05,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:04:05,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:05,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:04:09,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:04:12,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 01:04:12,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:04:15,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:04:15,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:04:16,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 01:04:18,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:04:22,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 01:04:22,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:04:27,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 01:04:31,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 01:04:32,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:32,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:04:32,693 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 01:04:32,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 01:04:37,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 01:04:37,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=547146.6666666666, ans=0.5 2023-09-30 01:04:40,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:04:44,728 INFO [train.py:1039] (1/4) Epoch 16, batch 2400, loss[loss=0.1873, simple_loss=0.2663, pruned_loss=0.05416, over 24117.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2598, pruned_loss=0.0556, over 4724776.02 frames. ], batch size: 80, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:04:44,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:04:48,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:04:51,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:04:51,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=547213.3333333334, ans=0.125 2023-09-30 01:04:53,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 01:04:53,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 01:04:55,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=547213.3333333334, ans=0.1 2023-09-30 01:04:56,006 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.899e+02 2.114e+02 2.474e+02 3.602e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 01:05:00,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:05:00,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:03,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 01:05:03,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:05:05,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:07,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 01:05:10,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:12,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 01:05:13,065 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.12 vs. limit=22.5 2023-09-30 01:05:18,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:05:19,315 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.36 vs. limit=22.5 2023-09-30 01:05:21,854 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 01:05:23,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:05:25,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:28,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:05:31,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 01:05:31,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:05:41,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:05:44,549 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=547413.3333333334, ans=0.125 2023-09-30 01:05:47,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:49,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=547413.3333333334, ans=0.125 2023-09-30 01:05:50,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:05:50,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:05:50,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:05:50,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:50,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:05:50,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:05:54,546 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.53 vs. limit=6.0 2023-09-30 01:05:57,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:05:58,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:05:58,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 01:06:00,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 01:06:01,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:06:01,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:06:01,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 01:06:03,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 01:06:03,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 01:06:03,296 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 01:06:04,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 01:06:06,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:06:08,315 INFO [train.py:1039] (1/4) Epoch 16, batch 2450, loss[loss=0.1873, simple_loss=0.2661, pruned_loss=0.05426, over 23387.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2588, pruned_loss=0.05527, over 4742261.66 frames. ], batch size: 105, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:06:08,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:08,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:10,607 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 01:06:10,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:12,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:06:15,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:06:15,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:19,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:19,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:20,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 01:06:26,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:06:26,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:30,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:06:30,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:06:30,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:06:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 01:06:32,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=547613.3333333334, ans=0.0 2023-09-30 01:06:33,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:36,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:06:38,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:06:42,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=547680.0, ans=0.125 2023-09-30 01:06:44,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:06:44,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:48,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 01:06:48,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:06:57,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:58,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:58,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:00,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:07:00,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:00,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:07:01,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 01:07:05,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:07:06,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:07:07,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=547746.6666666666, ans=0.125 2023-09-30 01:07:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:07:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:16,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:07:16,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 01:07:18,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:07:18,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=547813.3333333334, ans=0.05 2023-09-30 01:07:20,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:07:20,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 01:07:22,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:07:22,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:07:26,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:07:27,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:29,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:07:30,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 01:07:32,344 INFO [train.py:1039] (1/4) Epoch 16, batch 2500, loss[loss=0.209, simple_loss=0.2889, pruned_loss=0.06451, over 24354.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2579, pruned_loss=0.05493, over 4731993.24 frames. ], batch size: 77, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:07:32,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:07:37,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:44,696 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.913e+02 2.172e+02 2.502e+02 3.550e+02, threshold=4.344e+02, percent-clipped=0.0 2023-09-30 01:07:46,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=547880.0, ans=0.0 2023-09-30 01:07:47,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:07:47,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:49,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:49,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 01:07:58,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:07:59,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:01,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:08:01,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:08:01,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 01:08:04,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:04,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:06,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 01:08:06,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:07,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 01:08:07,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:08,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.89 vs. limit=15.0 2023-09-30 01:08:12,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:08:14,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:17,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:08:17,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 01:08:18,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:08:20,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:24,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:29,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:31,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:37,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:08:40,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 01:08:40,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:40,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:08:40,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=548146.6666666666, ans=0.125 2023-09-30 01:08:42,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:08:42,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:08:43,681 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 01:08:43,682 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 01:08:43,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 01:08:48,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:50,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 01:08:50,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 01:08:51,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:51,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 01:08:54,998 INFO [train.py:1039] (1/4) Epoch 16, batch 2550, loss[loss=0.198, simple_loss=0.2724, pruned_loss=0.06177, over 23299.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2583, pruned_loss=0.05438, over 4742019.45 frames. ], batch size: 93, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:08:55,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 01:08:58,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:00,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:09:02,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:09:03,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:05,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 01:09:06,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:09:10,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 01:09:10,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:09:12,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:15,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:09:15,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 01:09:17,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:17,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:18,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:20,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:09:20,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 01:09:21,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:09:21,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:21,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 01:09:33,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:09:41,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:41,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:41,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:42,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=548346.6666666666, ans=0.125 2023-09-30 01:09:43,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:09:50,566 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=548413.3333333334, ans=0.0 2023-09-30 01:09:51,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:54,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:54,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:09:54,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:09:54,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:09:54,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:09:58,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:58,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:03,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:10:03,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 01:10:03,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:10:05,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:06,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:10:06,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:10:10,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:16,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:10:18,248 INFO [train.py:1039] (1/4) Epoch 16, batch 2600, loss[loss=0.1621, simple_loss=0.2417, pruned_loss=0.04121, over 24345.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2589, pruned_loss=0.05418, over 4747890.69 frames. ], batch size: 61, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:10:19,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:24,927 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 01:10:26,571 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 01:10:26,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:10:26,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 01:10:28,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 01:10:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 01:10:31,185 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.861e+02 2.102e+02 2.275e+02 3.590e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-30 01:10:31,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:10:31,442 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 01:10:32,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 01:10:34,369 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 01:10:34,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=548613.3333333334, ans=0.2 2023-09-30 01:10:37,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:10:40,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 01:10:41,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 01:10:44,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:10:44,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 01:10:45,371 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.64 vs. limit=15.0 2023-09-30 01:10:48,115 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 01:10:48,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 01:10:54,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:10:54,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:54,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:10:54,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 01:10:54,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=548680.0, ans=0.04949747468305833 2023-09-30 01:10:58,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:11:04,513 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 01:11:09,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:09,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:10,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 01:11:10,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:10,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:11:12,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 01:11:16,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:11:16,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:11:19,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,835 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 01:11:22,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:11:23,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=548813.3333333334, ans=0.125 2023-09-30 01:11:27,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:29,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:11:29,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 01:11:31,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:32,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:11:34,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:11:40,237 INFO [train.py:1039] (1/4) Epoch 16, batch 2650, loss[loss=0.1591, simple_loss=0.2315, pruned_loss=0.04339, over 24456.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2601, pruned_loss=0.05522, over 4744882.01 frames. ], batch size: 58, lr: 6.46e-03, grad_scale: 4.0 2023-09-30 01:11:40,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 01:11:40,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:40,890 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=548880.0, ans=0.0 2023-09-30 01:11:43,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:11:48,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 01:11:48,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:49,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:11:49,138 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 01:11:51,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:11:54,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:54,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=548880.0, ans=0.125 2023-09-30 01:11:55,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:11:57,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:12:00,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:12:02,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 01:12:02,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:12:02,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:12:05,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 01:12:07,430 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 01:12:10,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:12,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 01:12:13,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:13,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 01:12:15,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=549013.3333333334, ans=0.1 2023-09-30 01:12:17,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:17,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:12:17,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:18,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:23,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 01:12:25,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 01:12:28,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:12:29,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=549080.0, ans=0.125 2023-09-30 01:12:31,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=549080.0, ans=0.035 2023-09-30 01:12:32,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 01:12:32,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:34,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:34,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:12:35,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:35,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:36,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=549080.0, ans=0.0 2023-09-30 01:12:37,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:39,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:40,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:12:40,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:12:42,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:12:44,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:44,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:12:46,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:47,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:49,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:12:52,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:52,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:12:52,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:52,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=549146.6666666666, ans=0.0 2023-09-30 01:12:53,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 01:12:57,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:59,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:04,932 INFO [train.py:1039] (1/4) Epoch 16, batch 2700, loss[loss=0.1926, simple_loss=0.252, pruned_loss=0.06655, over 23552.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2605, pruned_loss=0.0556, over 4731755.48 frames. ], batch size: 256, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:13:06,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:13:06,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:08,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:08,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 01:13:11,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:13:11,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:13:13,366 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:13:16,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:13:16,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:16,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:18,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:13:18,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:13:18,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:13:18,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:13:18,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 01:13:19,479 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.941e+02 2.174e+02 2.573e+02 4.504e+02, threshold=4.348e+02, percent-clipped=1.0 2023-09-30 01:13:19,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:13:22,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:13:22,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:13:22,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:25,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:13:27,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 01:13:28,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:13:33,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:13:33,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:13:40,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:13:40,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:13:41,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:13:43,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:13:48,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:13:48,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:13:48,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:13:51,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:51,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:13:59,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:14:01,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:02,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:14:02,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:04,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=549413.3333333334, ans=0.0 2023-09-30 01:14:08,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:09,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:10,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=549480.0, ans=0.125 2023-09-30 01:14:11,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:14:12,764 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:14,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:14,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:16,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:14:18,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:18,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:22,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 01:14:22,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:25,720 INFO [train.py:1039] (1/4) Epoch 16, batch 2750, loss[loss=0.1744, simple_loss=0.2623, pruned_loss=0.04325, over 24466.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2604, pruned_loss=0.05575, over 4729650.53 frames. ], batch size: 69, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:14:25,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:14:27,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 01:14:28,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 01:14:28,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:32,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:33,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:35,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:35,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:14:36,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:39,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=15.0 2023-09-30 01:14:40,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:14:40,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:14:40,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:14:40,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:40,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 01:14:42,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:14:42,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:48,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 01:14:49,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=549613.3333333334, ans=0.125 2023-09-30 01:14:50,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:50,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:50,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:52,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:14:52,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:53,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:14:53,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:53,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:57,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=549680.0, ans=0.125 2023-09-30 01:14:59,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:15:00,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:15:00,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:15:01,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:01,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:15:08,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:15:10,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:15:10,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:17,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:17,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:15:18,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:15:23,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:15:25,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:15:25,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 01:15:27,417 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-30 01:15:30,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:32,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 01:15:37,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:15:39,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:15:40,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 01:15:41,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:15:42,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:15:42,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 01:15:44,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:15:47,547 INFO [train.py:1039] (1/4) Epoch 16, batch 2800, loss[loss=0.1764, simple_loss=0.2526, pruned_loss=0.05013, over 24621.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2591, pruned_loss=0.05529, over 4724259.04 frames. ], batch size: 60, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:15:47,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:15:47,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:15:47,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:15:49,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 01:15:49,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:50,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=549880.0, ans=0.0 2023-09-30 01:15:51,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:52,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:54,433 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 01:15:54,434 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 01:15:57,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:00,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:16:00,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:16:01,934 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.870e+02 2.056e+02 2.355e+02 4.086e+02, threshold=4.112e+02, percent-clipped=0.0 2023-09-30 01:16:02,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:16:03,652 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.47 vs. limit=15.0 2023-09-30 01:16:04,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 01:16:07,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:16:08,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 01:16:10,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:10,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:16:10,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:12,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=549946.6666666666, ans=0.125 2023-09-30 01:16:13,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:15,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:15,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:16:16,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:26,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:16:28,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:29,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:29,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=550013.3333333334, ans=0.125 2023-09-30 01:16:31,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:16:31,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:36,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:36,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 01:16:36,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:38,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:38,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:16:41,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=550080.0, ans=0.025 2023-09-30 01:16:45,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:45,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:48,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:49,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=550080.0, ans=0.5 2023-09-30 01:16:50,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:16:50,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:50,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:16:51,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:16:51,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:16:54,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:54,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 01:16:54,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:56,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:56,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:58,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 01:16:59,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:00,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=550146.6666666666, ans=0.1 2023-09-30 01:17:01,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:17:01,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:17:04,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 01:17:10,305 INFO [train.py:1039] (1/4) Epoch 16, batch 2850, loss[loss=0.1911, simple_loss=0.2604, pruned_loss=0.06088, over 23755.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2588, pruned_loss=0.05548, over 4723469.37 frames. ], batch size: 149, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:17:10,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:17:10,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:17:10,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:17:12,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:15,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:15,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:17:17,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:17:19,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:20,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:17:22,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:17:23,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 01:17:29,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 01:17:31,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:31,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 01:17:32,392 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.15 vs. limit=6.0 2023-09-30 01:17:33,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:36,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 01:17:37,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 01:17:39,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:49,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=550346.6666666666, ans=0.125 2023-09-30 01:17:50,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:52,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:17:53,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:55,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:17:55,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:17:55,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:17:56,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:17:58,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 01:17:59,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:17:59,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:01,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:01,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:05,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:05,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:07,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:09,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:18:10,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:18:12,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:14,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:15,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:18:18,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:18:21,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 01:18:22,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 01:18:24,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:18:24,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 01:18:25,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:18:25,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:27,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:18:27,297 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 01:18:27,355 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 01:18:27,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:27,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:28,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.92 vs. limit=12.0 2023-09-30 01:18:33,209 INFO [train.py:1039] (1/4) Epoch 16, batch 2900, loss[loss=0.2109, simple_loss=0.2749, pruned_loss=0.07349, over 23808.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.258, pruned_loss=0.05538, over 4708936.57 frames. ], batch size: 164, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:18:33,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:18:33,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=550546.6666666666, ans=0.1 2023-09-30 01:18:34,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:34,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:37,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 01:18:41,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:41,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 01:18:43,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 01:18:45,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:18:46,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:18:46,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:48,256 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.959e+02 2.363e+02 2.831e+02 4.091e+02, threshold=4.726e+02, percent-clipped=0.0 2023-09-30 01:18:48,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:49,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.whiten.whitening_limit, batch_count=550613.3333333334, ans=12.0 2023-09-30 01:18:51,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:53,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:56,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:18:56,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 01:18:56,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:18:58,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:01,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 01:19:01,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=550613.3333333334, ans=0.125 2023-09-30 01:19:03,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 01:19:04,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:19:04,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 01:19:04,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:19:07,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:19:07,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:19:09,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:19:11,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:14,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:19:19,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:21,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 01:19:21,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 01:19:21,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:19:27,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:19:28,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 01:19:29,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:19:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:39,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=550813.3333333334, ans=0.0 2023-09-30 01:19:45,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:19:45,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:19:46,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=550813.3333333334, ans=0.05 2023-09-30 01:19:47,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 01:19:47,644 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=550813.3333333334, ans=0.125 2023-09-30 01:19:50,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:52,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 01:19:52,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:19:52,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:19:54,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=550880.0, ans=0.0 2023-09-30 01:19:55,413 INFO [train.py:1039] (1/4) Epoch 16, batch 2950, loss[loss=0.1939, simple_loss=0.2632, pruned_loss=0.06227, over 23733.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2591, pruned_loss=0.05599, over 4710781.76 frames. ], batch size: 232, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:19:59,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:20:00,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 01:20:02,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:02,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:04,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:06,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:20:07,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.78 vs. limit=10.0 2023-09-30 01:20:07,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 01:20:07,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 01:20:07,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:20:07,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:12,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=550946.6666666666, ans=0.0 2023-09-30 01:20:15,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:19,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:20,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:20:20,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:24,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:20:24,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:20:27,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:20:28,646 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.16 vs. limit=10.0 2023-09-30 01:20:29,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 01:20:34,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 01:20:34,705 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 01:20:36,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:20:38,436 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 01:20:39,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 01:20:39,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:40,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=551013.3333333334, ans=0.0 2023-09-30 01:20:41,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:41,181 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 01:20:41,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:20:44,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 01:20:45,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:45,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:20:47,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:49,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:20:49,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:49,186 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 01:20:51,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:51,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 01:20:54,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=551080.0, ans=0.2 2023-09-30 01:20:58,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:59,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:20:59,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 01:20:59,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:21:01,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 01:21:06,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:08,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:21:08,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=551146.6666666666, ans=0.125 2023-09-30 01:21:09,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:21:12,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:21:12,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:21:14,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:21:14,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:16,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:21:16,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:21:17,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:19,012 INFO [train.py:1039] (1/4) Epoch 16, batch 3000, loss[loss=0.1624, simple_loss=0.251, pruned_loss=0.03694, over 24423.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2598, pruned_loss=0.05569, over 4726664.08 frames. ], batch size: 69, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:21:19,012 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 01:21:34,552 INFO [train.py:1071] (1/4) Epoch 16, validation: loss=0.3091, simple_loss=0.2818, pruned_loss=0.1682, over 1125622.00 frames. 2023-09-30 01:21:34,553 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 01:21:34,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:21:36,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:36,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 01:21:37,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:39,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:21:39,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:21:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 01:21:44,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 01:21:45,215 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.19 vs. limit=12.0 2023-09-30 01:21:47,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:21:47,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:21:48,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 01:21:48,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:21:49,607 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.865e+02 2.031e+02 2.277e+02 3.298e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 01:21:55,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:22:05,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:22:13,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 01:22:14,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:22:16,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:22:16,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:22:18,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:22:20,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:20,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 01:22:21,379 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.66 vs. limit=15.0 2023-09-30 01:22:23,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 01:22:25,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:22:25,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:22:28,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:22:30,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:31,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:31,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:22:32,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=12.0 2023-09-30 01:22:35,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:22:35,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:35,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:22:36,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=551413.3333333334, ans=0.2 2023-09-30 01:22:38,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:40,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 01:22:42,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:22:43,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:22:43,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:22:44,539 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.94 vs. limit=10.0 2023-09-30 01:22:45,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:47,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:48,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:22:48,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 01:22:49,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:22:49,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 01:22:50,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:22:52,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 01:22:53,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:22:55,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:22:56,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 01:22:57,405 INFO [train.py:1039] (1/4) Epoch 16, batch 3050, loss[loss=0.1984, simple_loss=0.2723, pruned_loss=0.06231, over 23385.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2603, pruned_loss=0.0559, over 4723209.84 frames. ], batch size: 93, lr: 6.45e-03, grad_scale: 8.0 2023-09-30 01:22:57,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 01:22:57,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:22:59,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:23:01,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:23:01,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:23:01,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:01,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:23:04,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 01:23:05,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:06,599 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.83 vs. limit=22.5 2023-09-30 01:23:08,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:10,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:23:15,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:17,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=551613.3333333334, ans=0.1 2023-09-30 01:23:19,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 01:23:22,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=551613.3333333334, ans=0.0 2023-09-30 01:23:23,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 01:23:23,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 01:23:23,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:29,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:23:31,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:33,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:33,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:37,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:23:38,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:23:38,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:38,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:38,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:40,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:41,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:45,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:45,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 01:23:45,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:46,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:23:49,155 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-09-30 01:23:50,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:50,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:23:51,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:23:51,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:23:56,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:58,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:01,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=551813.3333333334, ans=0.125 2023-09-30 01:24:03,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:05,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:05,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:06,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:08,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:24:08,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:24:09,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 01:24:10,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:10,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:12,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 01:24:15,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:20,245 INFO [train.py:1039] (1/4) Epoch 16, batch 3100, loss[loss=0.1856, simple_loss=0.2602, pruned_loss=0.05556, over 23478.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2597, pruned_loss=0.05588, over 4716815.77 frames. ], batch size: 106, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:24:20,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:22,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:24:25,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:24:26,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 01:24:29,093 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:24:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 01:24:30,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=551880.0, ans=0.2 2023-09-30 01:24:31,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 01:24:33,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:24:35,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:24:36,434 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.867e+02 2.041e+02 2.309e+02 3.619e+02, threshold=4.081e+02, percent-clipped=0.0 2023-09-30 01:24:36,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:38,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:24:43,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:48,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=551946.6666666666, ans=0.2 2023-09-30 01:24:50,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 01:24:54,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:24:55,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:55,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=552013.3333333334, ans=0.125 2023-09-30 01:24:56,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:24:56,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:57,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:24:59,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:24:59,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 01:24:59,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:25:00,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:02,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 01:25:03,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:07,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:25:07,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 01:25:08,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 01:25:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:11,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:13,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:13,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:13,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:25:15,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:25:15,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:25:18,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:25:18,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:25:18,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:18,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:25:23,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:25:23,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 01:25:25,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:25:27,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 01:25:27,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:27,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:27,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 01:25:33,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=552146.6666666666, ans=0.0 2023-09-30 01:25:36,691 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.92 vs. limit=15.0 2023-09-30 01:25:37,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=552146.6666666666, ans=0.125 2023-09-30 01:25:40,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 01:25:42,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:42,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=552213.3333333334, ans=0.2 2023-09-30 01:25:43,550 INFO [train.py:1039] (1/4) Epoch 16, batch 3150, loss[loss=0.1834, simple_loss=0.2259, pruned_loss=0.07047, over 19240.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2578, pruned_loss=0.05606, over 4686878.15 frames. ], batch size: 388, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:25:43,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:44,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=552213.3333333334, ans=0.125 2023-09-30 01:25:45,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:25:45,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:25:46,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 01:25:46,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:47,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:25:49,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 01:25:50,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:52,353 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 01:25:55,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=552213.3333333334, ans=0.95 2023-09-30 01:25:56,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 01:25:56,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:56,998 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 01:25:59,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:25:59,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 01:26:00,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 01:26:00,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 01:26:00,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:00,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:02,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:05,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 01:26:05,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:05,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=552280.0, ans=0.125 2023-09-30 01:26:07,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:07,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:09,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:26:12,557 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=552280.0, ans=0.125 2023-09-30 01:26:14,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 01:26:15,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:26:18,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:26:19,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:20,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 01:26:23,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 01:26:25,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:26:26,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:26:27,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:26:27,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:27,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:26:28,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:26:28,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:26:30,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 01:26:31,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:26:31,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:33,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:26:33,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:35,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 01:26:35,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:38,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 01:26:38,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:39,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 01:26:42,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 01:26:43,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:26:43,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:45,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 01:26:45,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:26:45,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=552413.3333333334, ans=0.0 2023-09-30 01:26:46,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:50,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:26:51,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:51,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:26:58,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:26:59,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:00,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 01:27:07,219 INFO [train.py:1039] (1/4) Epoch 16, batch 3200, loss[loss=0.161, simple_loss=0.2369, pruned_loss=0.04255, over 17202.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2568, pruned_loss=0.055, over 4695601.22 frames. ], batch size: 37, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:27:07,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:27:07,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:27:10,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:11,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:27:11,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 01:27:12,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=552546.6666666666, ans=0.0 2023-09-30 01:27:15,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:27:19,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:27:23,443 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.941e+02 2.200e+02 2.639e+02 4.791e+02, threshold=4.401e+02, percent-clipped=2.0 2023-09-30 01:27:23,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:24,428 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.18 vs. limit=15.0 2023-09-30 01:27:32,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:27:42,296 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.58 vs. limit=15.0 2023-09-30 01:27:43,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 01:27:44,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:27:47,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 01:27:48,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:27:51,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:27:51,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:27:53,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:27:56,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 01:27:58,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:28:01,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 01:28:03,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 01:28:05,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:28:07,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=552746.6666666666, ans=0.5 2023-09-30 01:28:11,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:11,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:28:11,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:13,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 01:28:13,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:28:18,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:20,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 01:28:20,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 01:28:20,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 01:28:23,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 01:28:25,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:28:27,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:28:27,219 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 01:28:27,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:28:27,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:30,186 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 01:28:31,919 INFO [train.py:1039] (1/4) Epoch 16, batch 3250, loss[loss=0.1856, simple_loss=0.2675, pruned_loss=0.0518, over 24393.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2577, pruned_loss=0.05495, over 4693268.51 frames. ], batch size: 69, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:28:32,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:28:34,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=552880.0, ans=0.1 2023-09-30 01:28:35,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:28:45,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:28:45,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 01:28:48,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:49,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:49,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:28:50,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:28:50,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:28:52,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=552946.6666666666, ans=0.1 2023-09-30 01:28:53,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:53,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:28:55,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:28:55,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:28:57,542 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.81 vs. limit=15.0 2023-09-30 01:28:58,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:00,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:29:02,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:03,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:29:05,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:05,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:29:05,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:12,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 01:29:12,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:29:12,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:29:13,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:14,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.34 vs. limit=15.0 2023-09-30 01:29:15,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:29:19,832 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.64 vs. limit=10.0 2023-09-30 01:29:22,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:29:30,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:30,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:30,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 01:29:30,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:29:30,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:29:31,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:34,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 01:29:35,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 01:29:35,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:37,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:38,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:40,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:29:40,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:44,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:29:44,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:29:46,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 01:29:46,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:29:49,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:29:49,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 01:29:50,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=553146.6666666666, ans=0.125 2023-09-30 01:29:52,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:53,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 01:29:54,958 INFO [train.py:1039] (1/4) Epoch 16, batch 3300, loss[loss=0.1869, simple_loss=0.2602, pruned_loss=0.05684, over 23438.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2581, pruned_loss=0.05497, over 4709891.69 frames. ], batch size: 120, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:29:55,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 01:29:56,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 01:29:56,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:01,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:30:02,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:30:02,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:05,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:30:05,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:30:06,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:08,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:30:11,912 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.919e+02 2.093e+02 2.361e+02 4.091e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 01:30:12,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 01:30:13,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:13,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:16,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:16,537 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 01:30:18,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:30:19,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:30:20,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:30:20,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:30:20,286 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 01:30:25,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:25,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:30:28,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:28,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 01:30:30,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 01:30:30,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:32,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:30:33,670 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 01:30:35,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 01:30:35,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:30:37,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=553346.6666666666, ans=0.125 2023-09-30 01:30:39,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 01:30:40,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:30:42,826 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.47 vs. limit=15.0 2023-09-30 01:30:44,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:30:45,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:30:49,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:50,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:50,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:50,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:30:51,585 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.42 vs. limit=15.0 2023-09-30 01:30:53,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:30:53,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:53,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:30:55,202 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 01:30:56,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 01:30:59,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:31:00,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:00,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:31:03,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:31:05,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:05,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:31:05,342 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:31:06,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:08,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:31:10,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 01:31:12,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:14,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:15,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:31:15,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:31:15,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:19,254 INFO [train.py:1039] (1/4) Epoch 16, batch 3350, loss[loss=0.1626, simple_loss=0.2378, pruned_loss=0.04367, over 24308.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2594, pruned_loss=0.05505, over 4717305.90 frames. ], batch size: 56, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:31:19,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:19,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:21,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:31:22,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:24,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:31:26,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=553546.6666666666, ans=0.125 2023-09-30 01:31:27,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:28,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:31:28,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=553546.6666666666, ans=0.0 2023-09-30 01:31:30,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:31,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:31:33,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 01:31:34,741 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 01:31:35,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:38,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 01:31:38,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 01:31:39,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:31:39,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:31:40,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:41,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 01:31:42,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:42,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:31:45,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:47,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:47,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:49,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:31:50,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:55,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:55,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:58,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:59,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:01,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:01,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:03,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:06,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 01:32:06,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:32:07,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 01:32:07,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:32:09,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 01:32:10,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:11,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=553746.6666666666, ans=0.125 2023-09-30 01:32:12,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:20,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:21,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 01:32:23,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:25,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:32:25,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:32:31,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:34,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 01:32:34,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:32:34,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:32:37,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:38,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 01:32:38,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:38,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 01:32:40,389 INFO [train.py:1039] (1/4) Epoch 16, batch 3400, loss[loss=0.2046, simple_loss=0.268, pruned_loss=0.07059, over 23738.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.26, pruned_loss=0.05487, over 4736874.07 frames. ], batch size: 164, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:32:40,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:40,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:42,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:32:42,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:32:42,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 01:32:46,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 01:32:47,626 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 01:32:47,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:53,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:53,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:54,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:32:56,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:32:58,066 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.849e+02 2.111e+02 2.348e+02 3.492e+02, threshold=4.221e+02, percent-clipped=0.0 2023-09-30 01:33:01,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:03,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=553946.6666666666, ans=0.125 2023-09-30 01:33:04,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 01:33:10,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:33:11,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:12,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:13,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:33:18,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.38 vs. limit=15.0 2023-09-30 01:33:19,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:33:25,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 01:33:32,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:32,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:33,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.22 vs. limit=15.0 2023-09-30 01:33:33,486 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.37 vs. limit=12.0 2023-09-30 01:33:33,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 01:33:33,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:33:35,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:36,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:36,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:33:40,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:43,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:33:43,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:33:51,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:33:52,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 01:33:56,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:34:01,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 01:34:03,415 INFO [train.py:1039] (1/4) Epoch 16, batch 3450, loss[loss=0.1688, simple_loss=0.2504, pruned_loss=0.04357, over 24639.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2598, pruned_loss=0.05493, over 4735037.12 frames. ], batch size: 68, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:34:07,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 01:34:07,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:07,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=554213.3333333334, ans=0.125 2023-09-30 01:34:09,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:34:09,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 01:34:09,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=554213.3333333334, ans=0.07 2023-09-30 01:34:10,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:34:15,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:34:21,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:34:21,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:21,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:34:21,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:21,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=554280.0, ans=0.04949747468305833 2023-09-30 01:34:24,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:29,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 01:34:36,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 01:34:36,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:34:36,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:34:37,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:44,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 01:34:44,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:34:49,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:34:49,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:50,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:34:52,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:34:52,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=554413.3333333334, ans=0.0 2023-09-30 01:34:53,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 01:34:53,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:34:55,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:58,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.56 vs. limit=15.0 2023-09-30 01:34:59,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:02,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 01:35:06,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:35:11,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:35:13,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:17,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:22,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:22,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:35:22,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:35:22,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:35:25,304 INFO [train.py:1039] (1/4) Epoch 16, batch 3500, loss[loss=0.1911, simple_loss=0.2552, pruned_loss=0.06344, over 23707.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2581, pruned_loss=0.05481, over 4706323.48 frames. ], batch size: 149, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:35:27,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:30,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:35:30,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 01:35:30,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=554546.6666666666, ans=0.1 2023-09-30 01:35:34,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:35:35,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=554546.6666666666, ans=0.1 2023-09-30 01:35:35,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=554546.6666666666, ans=0.2 2023-09-30 01:35:36,519 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:35:39,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:39,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 01:35:43,267 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.893e+02 2.078e+02 2.352e+02 3.454e+02, threshold=4.155e+02, percent-clipped=0.0 2023-09-30 01:35:45,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:35:47,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:47,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:35:47,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:35:49,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:35:49,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:51,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:35:51,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 01:35:53,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:53,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:35:56,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:35:58,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=554680.0, ans=0.125 2023-09-30 01:36:00,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:00,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 01:36:00,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:36:03,405 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.82 vs. limit=10.0 2023-09-30 01:36:04,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:05,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:36:05,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:07,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=554680.0, ans=0.125 2023-09-30 01:36:08,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:36:08,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:11,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 01:36:13,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 01:36:13,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 01:36:13,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:16,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:16,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:17,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:36:22,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:36:22,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:36:26,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:36:28,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 01:36:28,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 01:36:28,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:36:31,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:32,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:34,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:37,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 01:36:37,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:40,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:41,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 01:36:43,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 01:36:44,561 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.16 vs. limit=22.5 2023-09-30 01:36:46,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:46,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=554880.0, ans=0.1 2023-09-30 01:36:48,100 INFO [train.py:1039] (1/4) Epoch 16, batch 3550, loss[loss=0.1711, simple_loss=0.2526, pruned_loss=0.04479, over 23692.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2572, pruned_loss=0.05447, over 4713892.17 frames. ], batch size: 85, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:36:48,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:48,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:36:48,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:36:52,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:36:59,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:01,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:37:01,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:03,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:37:03,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:04,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:37:04,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:37:09,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:09,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:37:10,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:10,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:37:12,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:37:16,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=554946.6666666666, ans=0.125 2023-09-30 01:37:18,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:37:18,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:20,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:20,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:20,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=555013.3333333334, ans=0.0 2023-09-30 01:37:22,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:37:22,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 01:37:22,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:23,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:24,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:37:31,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:33,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:33,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:35,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 01:37:36,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:37:38,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 01:37:38,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=555080.0, ans=0.2 2023-09-30 01:37:39,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:41,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:37:41,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:37:44,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 01:37:44,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:51,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:52,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 01:37:52,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:37:58,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:58,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 01:38:02,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=555146.6666666666, ans=10.0 2023-09-30 01:38:04,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=555146.6666666666, ans=0.125 2023-09-30 01:38:08,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 01:38:10,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:10,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:38:11,659 INFO [train.py:1039] (1/4) Epoch 16, batch 3600, loss[loss=0.2088, simple_loss=0.2837, pruned_loss=0.06698, over 24047.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2578, pruned_loss=0.05454, over 4703370.00 frames. ], batch size: 86, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:38:11,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:13,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:14,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:38:15,201 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=555213.3333333334, ans=0.125 2023-09-30 01:38:18,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:19,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:21,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:38:22,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:38:22,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:22,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 01:38:25,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:38:27,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:27,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=555280.0, ans=0.125 2023-09-30 01:38:28,712 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 2.004e+02 2.343e+02 2.780e+02 3.954e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-30 01:38:31,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:35,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:37,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:38:39,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:39,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 01:38:39,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:41,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=555280.0, ans=0.125 2023-09-30 01:38:42,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:43,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:38:43,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:38:47,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:47,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:38:48,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 01:38:52,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=555346.6666666666, ans=0.2 2023-09-30 01:38:54,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:56,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:38:56,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 01:39:01,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:39:06,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:09,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:13,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=555413.3333333334, ans=0.2 2023-09-30 01:39:15,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:39:15,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:39:15,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 01:39:17,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 01:39:17,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 01:39:20,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:39:20,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:39:22,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 01:39:23,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:23,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:39:23,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:25,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 01:39:26,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 01:39:26,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=555480.0, ans=0.0 2023-09-30 01:39:29,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:30,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 01:39:32,466 INFO [train.py:1039] (1/4) Epoch 16, batch 3650, loss[loss=0.1522, simple_loss=0.2318, pruned_loss=0.03626, over 24592.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2577, pruned_loss=0.05454, over 4715681.77 frames. ], batch size: 60, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:39:36,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 01:39:38,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:39:45,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 01:39:46,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 01:39:49,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:39:49,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:39:49,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:39:55,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:39:55,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:57,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 01:39:57,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:39:57,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:57,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 01:39:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:40:00,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:00,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:00,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:40:03,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 01:40:06,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 01:40:07,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:40:08,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 01:40:10,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:10,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:40:12,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=555680.0, ans=0.1 2023-09-30 01:40:15,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:40:17,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:17,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:40:19,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:40:20,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:40:20,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:40:25,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:27,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:27,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:28,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:40:28,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:30,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:36,445 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 01:40:39,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:41,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:41,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:40:42,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:42,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:40:44,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:44,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 01:40:46,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:49,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:40:53,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:53,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:40:54,841 INFO [train.py:1039] (1/4) Epoch 16, batch 3700, loss[loss=0.2085, simple_loss=0.277, pruned_loss=0.07, over 23274.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2587, pruned_loss=0.05545, over 4714984.78 frames. ], batch size: 105, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:40:56,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:56,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 01:40:56,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:58,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:40:59,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:41:01,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:41:05,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:05,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:08,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:41:08,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:41:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:41:11,784 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.928e+02 2.143e+02 2.451e+02 3.411e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 01:41:11,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:14,984 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 01:41:21,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:41:23,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:41:23,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:41:23,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 01:41:23,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:26,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=556013.3333333334, ans=0.05 2023-09-30 01:41:28,939 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.41 vs. limit=15.0 2023-09-30 01:41:30,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:31,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 01:41:32,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:34,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:41:37,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:37,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:41:39,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:41:43,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:43,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 01:41:44,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:45,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 01:41:48,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:41:48,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:41:51,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:53,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 01:41:54,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:41:54,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:41:54,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:41:55,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:58,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:42:01,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 01:42:02,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 01:42:04,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:42:04,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:05,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:42:07,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:42:10,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:42:11,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:42:12,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:42:13,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 01:42:15,960 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.20 vs. limit=15.0 2023-09-30 01:42:16,463 INFO [train.py:1039] (1/4) Epoch 16, batch 3750, loss[loss=0.1931, simple_loss=0.2665, pruned_loss=0.05981, over 23121.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2593, pruned_loss=0.05551, over 4733158.12 frames. ], batch size: 105, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:42:16,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:42:19,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:42:21,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 01:42:21,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:42:22,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:24,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:25,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:42:30,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:36,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:42:36,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:42:36,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=556280.0, ans=0.2 2023-09-30 01:42:39,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:42:43,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.56 vs. limit=15.0 2023-09-30 01:42:44,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:42:44,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 01:42:46,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:42:47,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:42:49,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:53,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 01:42:53,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=556346.6666666666, ans=0.0 2023-09-30 01:42:57,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 01:42:59,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:43:00,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:43:00,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=556346.6666666666, ans=0.1 2023-09-30 01:43:01,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:06,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:08,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:43:13,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 01:43:16,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:20,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:43:22,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:43:25,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:43:28,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:43:30,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:43:31,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:43:33,341 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=556480.0, ans=0.125 2023-09-30 01:43:34,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:43:36,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:43:37,529 INFO [train.py:1039] (1/4) Epoch 16, batch 3800, loss[loss=0.19, simple_loss=0.2633, pruned_loss=0.05836, over 23391.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2589, pruned_loss=0.05528, over 4737071.96 frames. ], batch size: 105, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:43:43,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:43:49,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:49,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:43:49,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 01:43:52,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=556546.6666666666, ans=0.125 2023-09-30 01:43:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:53,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:43:55,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:43:56,511 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.836e+02 2.013e+02 2.222e+02 3.108e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 01:43:56,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 01:43:56,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:58,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:43:58,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:59,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:43:59,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:01,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 01:44:04,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 01:44:06,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:44:07,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:10,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:44:12,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:44:12,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:44:12,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:17,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:17,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:18,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.10 vs. limit=15.0 2023-09-30 01:44:23,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:44:23,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 01:44:25,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:32,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:44:39,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:44:40,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 01:44:43,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 01:44:45,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:45,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:46,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:48,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 01:44:50,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.05 vs. limit=22.5 2023-09-30 01:44:53,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 01:44:53,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 01:44:53,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:55,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:45:01,127 INFO [train.py:1039] (1/4) Epoch 16, batch 3850, loss[loss=0.1892, simple_loss=0.2663, pruned_loss=0.05606, over 23900.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2579, pruned_loss=0.05541, over 4722274.38 frames. ], batch size: 86, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:45:02,075 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.71 vs. limit=15.0 2023-09-30 01:45:02,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:45:02,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:45:08,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:45:10,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 01:45:10,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:45:12,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:15,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:45:16,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=556946.6666666666, ans=0.125 2023-09-30 01:45:18,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:19,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:45:21,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 01:45:28,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:30,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:34,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:34,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=557013.3333333334, ans=0.04949747468305833 2023-09-30 01:45:35,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:45:38,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:40,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:45:40,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:40,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:45:41,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:44,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:45:44,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 01:45:45,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 01:45:45,278 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=557013.3333333334, ans=0.125 2023-09-30 01:45:46,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:46,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:49,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 01:45:52,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 01:45:54,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:56,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 01:45:59,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:46:04,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:07,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:46:09,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=557146.6666666666, ans=0.125 2023-09-30 01:46:10,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:10,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 01:46:13,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 01:46:15,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:16,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:17,112 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.89 vs. limit=22.5 2023-09-30 01:46:19,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:46:19,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:46:19,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:46:21,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 01:46:22,639 INFO [train.py:1039] (1/4) Epoch 16, batch 3900, loss[loss=0.1795, simple_loss=0.2513, pruned_loss=0.05384, over 24344.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.257, pruned_loss=0.05504, over 4719412.81 frames. ], batch size: 61, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:46:22,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:46:23,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=557213.3333333334, ans=0.0 2023-09-30 01:46:24,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 01:46:25,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:25,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:26,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.25 vs. limit=6.0 2023-09-30 01:46:27,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:46:27,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:27,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:46:28,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:28,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:29,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:30,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 01:46:30,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:34,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:36,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:38,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:46:38,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:42,017 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.908e+02 2.170e+02 2.558e+02 5.090e+02, threshold=4.341e+02, percent-clipped=1.0 2023-09-30 01:46:42,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:42,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:42,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:46:45,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 01:46:45,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:46:46,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 01:46:47,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:48,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 01:46:48,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 01:46:53,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:46:53,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:53,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:46:55,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:46:58,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:47:01,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:47:03,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:47:03,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:05,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:47:12,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:12,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:47:16,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=557413.3333333334, ans=0.0 2023-09-30 01:47:19,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:47:21,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:47:28,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=557480.0, ans=0.125 2023-09-30 01:47:33,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:47:34,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:35,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=557480.0, ans=0.0 2023-09-30 01:47:36,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 01:47:36,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 01:47:37,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:37,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=557480.0, ans=0.125 2023-09-30 01:47:40,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 01:47:42,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:43,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 01:47:46,591 INFO [train.py:1039] (1/4) Epoch 16, batch 3950, loss[loss=0.1837, simple_loss=0.2674, pruned_loss=0.04999, over 24400.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2562, pruned_loss=0.05518, over 4694631.70 frames. ], batch size: 77, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:47:52,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:53,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 01:47:53,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:47:58,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:48:00,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:48:05,165 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 01:48:06,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:06,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 01:48:06,820 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 01:48:08,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:09,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:09,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:48:09,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:11,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 01:48:13,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=557613.3333333334, ans=0.1 2023-09-30 01:48:14,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:48:16,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:16,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:48:17,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:48:17,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=557613.3333333334, ans=0.125 2023-09-30 01:48:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:48:31,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:48:31,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:48:36,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 01:48:40,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=557746.6666666666, ans=0.0 2023-09-30 01:48:40,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=557746.6666666666, ans=0.0 2023-09-30 01:48:43,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 01:48:43,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 01:48:43,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:48:46,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:48:48,009 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:48:53,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:48:53,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:48:53,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:53,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:48:55,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 01:48:58,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=557813.3333333334, ans=0.0 2023-09-30 01:49:00,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:49:00,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:49:07,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 01:49:10,303 INFO [train.py:1039] (1/4) Epoch 16, batch 4000, loss[loss=0.1911, simple_loss=0.2681, pruned_loss=0.05704, over 23464.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2574, pruned_loss=0.05563, over 4693326.19 frames. ], batch size: 93, lr: 6.41e-03, grad_scale: 32.0 2023-09-30 01:49:15,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:19,008 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.51 vs. limit=22.5 2023-09-30 01:49:21,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:27,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:28,510 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.918e+02 2.123e+02 2.513e+02 3.159e+02, threshold=4.246e+02, percent-clipped=0.0 2023-09-30 01:49:28,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:49:28,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:28,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 01:49:30,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:49:31,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 01:49:31,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:49:31,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 01:49:33,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:37,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:49:37,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:49:37,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:49:37,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:37,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:49:40,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:49:40,260 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 01:49:40,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:49:42,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:49:43,883 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 01:49:44,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=558013.3333333334, ans=0.1 2023-09-30 01:49:45,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:49:45,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:49:47,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=558013.3333333334, ans=0.0 2023-09-30 01:49:51,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 01:49:53,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:53,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=558013.3333333334, ans=0.125 2023-09-30 01:49:56,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:49:57,483 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 01:49:59,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:49:59,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 01:49:59,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:00,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:02,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:50:03,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:50:03,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:50:03,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:50:06,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 01:50:07,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:10,277 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 01:50:15,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:50:15,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.27 vs. limit=10.0 2023-09-30 01:50:16,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=558146.6666666666, ans=0.1 2023-09-30 01:50:16,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=558146.6666666666, ans=0.0 2023-09-30 01:50:18,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:50:20,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:50:21,080 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.81 vs. limit=12.0 2023-09-30 01:50:21,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:22,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:50:23,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:28,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:29,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=558146.6666666666, ans=0.125 2023-09-30 01:50:31,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:50:31,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 01:50:33,777 INFO [train.py:1039] (1/4) Epoch 16, batch 4050, loss[loss=0.2345, simple_loss=0.2882, pruned_loss=0.09043, over 19413.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.259, pruned_loss=0.05622, over 4686077.42 frames. ], batch size: 388, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:50:35,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:50:35,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:50:36,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:50:37,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:50:38,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:39,650 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.05 vs. limit=15.0 2023-09-30 01:50:43,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:46,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:50:48,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:50:49,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:50:51,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:53,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:55,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:50:59,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 01:51:03,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 01:51:03,246 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 01:51:05,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=558346.6666666666, ans=0.125 2023-09-30 01:51:06,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:51:14,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 01:51:15,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:19,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:22,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:51:23,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:51:23,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:25,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:51:30,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 01:51:30,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:51:32,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:32,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=558413.3333333334, ans=0.07 2023-09-30 01:51:34,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 01:51:36,189 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.40 vs. limit=15.0 2023-09-30 01:51:40,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:45,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=558480.0, ans=0.0 2023-09-30 01:51:47,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 01:51:48,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:48,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:51:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 01:51:50,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 01:51:50,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:51:53,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=558546.6666666666, ans=0.125 2023-09-30 01:51:54,847 INFO [train.py:1039] (1/4) Epoch 16, batch 4100, loss[loss=0.1861, simple_loss=0.2747, pruned_loss=0.04873, over 24637.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2597, pruned_loss=0.05587, over 4683177.48 frames. ], batch size: 68, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:51:55,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:51:55,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:51:55,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:52:03,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 01:52:06,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 01:52:06,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=558546.6666666666, ans=0.125 2023-09-30 01:52:07,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 01:52:09,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 01:52:09,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:11,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:52:13,057 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 01:52:16,681 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.940e+02 2.145e+02 2.448e+02 4.292e+02, threshold=4.289e+02, percent-clipped=1.0 2023-09-30 01:52:16,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:18,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:52:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:20,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:52:23,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:52:23,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:23,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:52:25,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 01:52:25,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:25,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:52:25,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:25,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:52:27,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 01:52:30,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:33,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 01:52:35,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:52:36,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:36,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 01:52:39,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:52:39,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:52:39,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:52:41,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 01:52:43,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:52:43,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:52:45,340 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:52:46,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 01:52:46,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:48,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:52:50,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:53,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=558746.6666666666, ans=0.2 2023-09-30 01:52:56,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:52:59,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:00,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:53:02,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=558813.3333333334, ans=0.04949747468305833 2023-09-30 01:53:05,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=558813.3333333334, ans=0.125 2023-09-30 01:53:10,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:10,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:53:11,020 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.36 vs. limit=6.0 2023-09-30 01:53:13,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:16,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:53:18,274 INFO [train.py:1039] (1/4) Epoch 16, batch 4150, loss[loss=0.1618, simple_loss=0.2464, pruned_loss=0.03859, over 24435.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2594, pruned_loss=0.05582, over 4681100.27 frames. ], batch size: 63, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:53:20,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:53:21,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:53:21,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:53:21,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:22,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=558880.0, ans=0.0 2023-09-30 01:53:25,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 01:53:25,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:25,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 01:53:27,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 01:53:27,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 01:53:29,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:33,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:53:33,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:37,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:53:37,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=558946.6666666666, ans=0.0 2023-09-30 01:53:39,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:53:39,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:53:42,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:53:42,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:42,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:53:49,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:52,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:53:54,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 01:53:55,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 01:53:55,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:53:58,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 01:53:58,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:53:58,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:01,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:02,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:06,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.17 vs. limit=12.0 2023-09-30 01:54:08,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 01:54:10,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:11,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:13,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 01:54:13,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:54:15,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 01:54:17,854 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.42 vs. limit=6.0 2023-09-30 01:54:19,430 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.96 vs. limit=15.0 2023-09-30 01:54:19,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:54:20,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:22,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:22,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 01:54:22,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:22,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:54:23,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:54:24,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=559146.6666666666, ans=0.125 2023-09-30 01:54:26,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 01:54:26,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:26,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:54:26,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:54:28,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 01:54:28,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:28,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:54:28,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=559146.6666666666, ans=0.0 2023-09-30 01:54:30,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:54:32,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:32,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 01:54:32,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:38,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:54:40,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 01:54:41,596 INFO [train.py:1039] (1/4) Epoch 16, batch 4200, loss[loss=0.1804, simple_loss=0.2492, pruned_loss=0.05579, over 23763.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2584, pruned_loss=0.05584, over 4691697.35 frames. ], batch size: 149, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:54:41,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:54:43,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:54:44,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:54:46,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:46,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:49,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 01:54:49,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=559213.3333333334, ans=0.125 2023-09-30 01:54:53,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 01:54:53,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:53,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=559213.3333333334, ans=0.2 2023-09-30 01:54:55,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:58,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:55:03,002 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.901e+02 2.194e+02 2.609e+02 4.040e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-30 01:55:03,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:55:03,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:05,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:05,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 01:55:05,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:55:06,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:08,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:55:08,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:55:09,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:55:11,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 01:55:11,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:16,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:55:16,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:55:19,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:55:19,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:55:22,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:55:22,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 01:55:22,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=559346.6666666666, ans=0.2 2023-09-30 01:55:24,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:24,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:55:30,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:55:32,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:39,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:55:42,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 01:55:45,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:50,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:55:52,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:55:53,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 01:55:56,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:56:01,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:56:01,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:56:03,849 INFO [train.py:1039] (1/4) Epoch 16, batch 4250, loss[loss=0.1755, simple_loss=0.2574, pruned_loss=0.0468, over 24285.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2561, pruned_loss=0.0554, over 4685430.16 frames. ], batch size: 74, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:56:04,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:09,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:56:09,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 01:56:09,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:56:13,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:17,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:20,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=559613.3333333334, ans=0.0 2023-09-30 01:56:21,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:21,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:24,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:56:24,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:56:24,939 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=559613.3333333334, ans=0.015 2023-09-30 01:56:26,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:27,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:30,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:30,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=559613.3333333334, ans=0.0 2023-09-30 01:56:31,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:56:33,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:35,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 01:56:35,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=559680.0, ans=0.125 2023-09-30 01:56:40,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 01:56:40,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:40,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=559680.0, ans=0.0 2023-09-30 01:56:40,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=559680.0, ans=0.125 2023-09-30 01:56:42,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:42,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:43,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:56:43,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:43,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:48,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:56:49,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:56:53,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:56:54,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:54,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=559746.6666666666, ans=0.1 2023-09-30 01:56:56,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 01:56:56,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:56:56,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 01:56:57,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:57:00,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:57:02,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:02,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:57:04,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 01:57:06,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:57:06,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:57:10,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:14,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:17,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:57:19,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:57:21,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:21,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:57:21,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:57:21,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 01:57:22,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=559813.3333333334, ans=0.2 2023-09-30 01:57:22,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=559813.3333333334, ans=0.125 2023-09-30 01:57:24,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:25,313 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.29 vs. limit=12.0 2023-09-30 01:57:26,119 INFO [train.py:1039] (1/4) Epoch 16, batch 4300, loss[loss=0.1712, simple_loss=0.2562, pruned_loss=0.04316, over 24635.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2559, pruned_loss=0.05494, over 4692083.17 frames. ], batch size: 73, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:57:29,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:30,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:57:33,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:37,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=559880.0, ans=0.0 2023-09-30 01:57:42,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:43,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 01:57:43,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:57:44,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.42 vs. limit=12.0 2023-09-30 01:57:45,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:57:45,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:57:47,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.838e+02 2.077e+02 2.423e+02 4.089e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 01:57:47,371 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 01:57:49,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:57:52,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:57:53,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=559946.6666666666, ans=0.125 2023-09-30 01:57:59,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 01:57:59,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:57:59,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 01:58:02,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:58:03,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:58:07,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:58:07,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:58:07,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:58:09,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:09,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:58:09,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 01:58:10,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 01:58:12,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:58:16,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:58:17,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:17,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 01:58:17,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 01:58:18,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 01:58:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:20,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 01:58:20,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 01:58:25,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:27,411 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 01:58:28,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:58:29,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:29,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:32,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 01:58:33,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:58:33,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:35,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:58:35,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:35,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=560146.6666666666, ans=0.1 2023-09-30 01:58:36,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:58:38,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:58:41,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:41,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:42,488 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.66 vs. limit=10.0 2023-09-30 01:58:43,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:47,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 01:58:48,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=560146.6666666666, ans=0.125 2023-09-30 01:58:49,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:58:50,649 INFO [train.py:1039] (1/4) Epoch 16, batch 4350, loss[loss=0.1802, simple_loss=0.2456, pruned_loss=0.05739, over 23757.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2572, pruned_loss=0.05526, over 4698530.68 frames. ], batch size: 179, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:58:54,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:56,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:57,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.48 vs. limit=15.0 2023-09-30 01:58:58,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=560213.3333333334, ans=0.0 2023-09-30 01:58:59,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.06 vs. limit=22.5 2023-09-30 01:59:00,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:59:00,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:59:06,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:59:10,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:10,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=560280.0, ans=0.2 2023-09-30 01:59:13,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:59:13,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:16,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:59:18,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=560280.0, ans=0.0 2023-09-30 01:59:18,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.54 vs. limit=15.0 2023-09-30 01:59:20,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:59:22,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:59:27,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 01:59:27,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:28,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:36,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:40,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 01:59:41,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:43,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:59:48,110 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 01:59:51,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:51,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:59:52,798 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 01:59:52,911 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 01:59:52,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:52,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:54,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:59:56,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:56,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:56,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:59,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 01:59:59,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:59,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:59,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:00,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 02:00:02,413 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 02:00:02,421 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 02:00:02,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 02:00:07,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:00:07,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:00:07,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:09,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:00:11,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 02:00:13,467 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 02:00:14,773 INFO [train.py:1039] (1/4) Epoch 16, batch 4400, loss[loss=0.1969, simple_loss=0.267, pruned_loss=0.06339, over 23489.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2577, pruned_loss=0.05529, over 4713460.25 frames. ], batch size: 120, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:00:15,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:19,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:22,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:00:24,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 02:00:24,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 02:00:24,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 02:00:24,609 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 02:00:26,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:00:26,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:27,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 02:00:29,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:30,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:30,879 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 02:00:35,159 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.842e+02 2.058e+02 2.254e+02 3.655e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:00:35,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:35,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 02:00:36,807 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 02:00:38,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 02:00:40,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 02:00:40,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 02:00:40,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:41,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:41,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:43,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:00:45,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 02:00:45,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 02:00:48,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:49,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:00:49,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:51,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:53,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:53,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 02:00:53,256 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 02:00:57,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:04,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:01:04,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=560746.6666666666, ans=0.5 2023-09-30 02:01:07,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 02:01:08,792 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=560746.6666666666, ans=0.125 2023-09-30 02:01:08,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=560746.6666666666, ans=0.125 2023-09-30 02:01:10,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:01:13,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:16,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:01:16,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 02:01:18,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:01:18,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:01:18,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:01:18,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:01:24,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 02:01:28,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 02:01:29,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 02:01:29,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:01:29,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 02:01:31,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:01:34,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:01:36,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 02:01:37,435 INFO [train.py:1039] (1/4) Epoch 16, batch 4450, loss[loss=0.2013, simple_loss=0.2827, pruned_loss=0.05991, over 24099.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2592, pruned_loss=0.05612, over 4699626.87 frames. ], batch size: 80, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:01:40,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:43,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:45,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:01:51,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:01:51,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:01:53,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=560946.6666666666, ans=0.1 2023-09-30 02:01:57,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:57,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=560946.6666666666, ans=0.0 2023-09-30 02:02:00,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:02:02,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:02:02,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:02,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 02:02:02,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:04,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:05,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:05,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:02:09,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:02:13,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:15,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:17,593 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.53 vs. limit=15.0 2023-09-30 02:02:18,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:02:23,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:02:24,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 02:02:24,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 02:02:24,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:02:26,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:28,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 02:02:28,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=561080.0, ans=0.0 2023-09-30 02:02:31,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=561080.0, ans=0.125 2023-09-30 02:02:32,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:02:37,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:37,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 02:02:37,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:37,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:37,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:02:37,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:38,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=561080.0, ans=0.0 2023-09-30 02:02:39,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:43,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:02:44,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 02:02:46,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:02:47,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:50,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:52,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:52,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:02:52,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=561146.6666666666, ans=0.125 2023-09-30 02:02:53,153 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.67 vs. limit=15.0 2023-09-30 02:02:55,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:02:58,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 02:03:00,046 INFO [train.py:1039] (1/4) Epoch 16, batch 4500, loss[loss=0.1654, simple_loss=0.2495, pruned_loss=0.04062, over 24514.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2587, pruned_loss=0.05598, over 4700283.30 frames. ], batch size: 66, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:03:00,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:03:03,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=561213.3333333334, ans=0.0 2023-09-30 02:03:04,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:05,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 02:03:05,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 02:03:08,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:14,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:03:15,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:15,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:03:17,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:03:17,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:18,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:19,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=561280.0, ans=0.125 2023-09-30 02:03:23,474 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.923e+02 2.201e+02 2.757e+02 3.678e+02, threshold=4.403e+02, percent-clipped=0.0 2023-09-30 02:03:28,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.63 vs. limit=15.0 2023-09-30 02:03:30,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:31,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:03:33,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:03:33,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:03:35,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:03:43,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:03:47,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=561346.6666666666, ans=0.0 2023-09-30 02:03:48,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:03:52,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:03:55,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:03:57,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 02:03:58,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:03:58,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:03:59,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.57 vs. limit=15.0 2023-09-30 02:04:00,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=561413.3333333334, ans=0.05 2023-09-30 02:04:01,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:04:03,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:04:03,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 02:04:03,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:04:03,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:08,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:04:08,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:04:09,104 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=12.0 2023-09-30 02:04:11,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:14,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:04:14,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:04:15,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 02:04:17,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 02:04:17,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 02:04:18,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=561480.0, ans=0.0 2023-09-30 02:04:21,870 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=15.0 2023-09-30 02:04:22,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 02:04:24,036 INFO [train.py:1039] (1/4) Epoch 16, batch 4550, loss[loss=0.1652, simple_loss=0.2442, pruned_loss=0.04308, over 24599.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2574, pruned_loss=0.05515, over 4697404.60 frames. ], batch size: 60, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:04:26,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 02:04:27,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:31,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=561546.6666666666, ans=0.125 2023-09-30 02:04:32,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:32,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:33,031 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-09-30 02:04:35,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:38,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:04:39,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.25 vs. limit=15.0 2023-09-30 02:04:40,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:41,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:04:41,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:04:41,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:45,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:45,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:50,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:04:54,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 02:04:54,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 02:04:54,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=561613.3333333334, ans=0.125 2023-09-30 02:04:55,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:04:57,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 02:05:00,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 02:05:01,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:05,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 02:05:06,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:05:11,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:05:13,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 02:05:16,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:18,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:19,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:19,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:22,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 02:05:22,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 02:05:22,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:05:22,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 02:05:25,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 02:05:25,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:27,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:27,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:28,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:28,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:05:31,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:05:31,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 02:05:34,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:34,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:05:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 02:05:34,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:05:37,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 02:05:40,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:05:40,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:05:43,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:05:43,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:43,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:05:46,086 INFO [train.py:1039] (1/4) Epoch 16, batch 4600, loss[loss=0.2056, simple_loss=0.279, pruned_loss=0.06617, over 23638.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2557, pruned_loss=0.05467, over 4692883.96 frames. ], batch size: 85, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:05:46,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:05:46,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=561880.0, ans=0.125 2023-09-30 02:05:47,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:05:49,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:49,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=561880.0, ans=0.125 2023-09-30 02:05:51,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:55,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:05:55,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:05:55,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:05:56,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 02:05:58,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:06:02,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:06:02,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:05,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:09,418 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.22 vs. limit=15.0 2023-09-30 02:06:09,848 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.941e+02 2.143e+02 2.418e+02 3.430e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 02:06:12,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 02:06:13,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:15,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:18,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:06:18,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:25,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 02:06:25,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:06:25,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:06:33,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:34,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:06:35,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:06:40,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 02:06:40,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:06:40,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=562080.0, ans=0.0 2023-09-30 02:06:45,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:47,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:06:50,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:50,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 02:06:50,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:51,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 02:06:51,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:52,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=562146.6666666666, ans=0.125 2023-09-30 02:06:53,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:54,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:54,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:58,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:58,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 02:06:58,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 02:06:58,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 02:06:59,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:01,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:01,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:03,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:07:09,713 INFO [train.py:1039] (1/4) Epoch 16, batch 4650, loss[loss=0.1929, simple_loss=0.2591, pruned_loss=0.06336, over 23800.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2563, pruned_loss=0.05446, over 4703261.18 frames. ], batch size: 179, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:07:13,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:07:16,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:16,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:16,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:07:16,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:16,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:20,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:21,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 02:07:22,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=562213.3333333334, ans=0.1 2023-09-30 02:07:26,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:07:27,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 02:07:29,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:29,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 02:07:29,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:07:30,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 02:07:30,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 02:07:30,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:31,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=562280.0, ans=0.125 2023-09-30 02:07:33,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:07:36,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:07:38,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:38,336 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 02:07:41,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:44,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 02:07:46,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:46,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:07:48,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 02:07:48,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:07:50,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=562346.6666666666, ans=0.125 2023-09-30 02:07:51,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:07:52,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.18 vs. limit=15.0 2023-09-30 02:07:56,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:01,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:04,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:04,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=562413.3333333334, ans=10.0 2023-09-30 02:08:06,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:06,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:08:10,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 02:08:11,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 02:08:13,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 02:08:13,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 02:08:14,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:21,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:08:21,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:23,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 02:08:23,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:24,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:24,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:08:26,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:08:30,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:08:30,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:32,590 INFO [train.py:1039] (1/4) Epoch 16, batch 4700, loss[loss=0.1779, simple_loss=0.2531, pruned_loss=0.05139, over 24298.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2574, pruned_loss=0.05475, over 4708987.18 frames. ], batch size: 61, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:08:32,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:35,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:35,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:08:35,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:08:36,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=562546.6666666666, ans=0.0 2023-09-30 02:08:36,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=562546.6666666666, ans=0.125 2023-09-30 02:08:37,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:08:38,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:08:39,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 02:08:42,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=562546.6666666666, ans=0.125 2023-09-30 02:08:43,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=562546.6666666666, ans=0.125 2023-09-30 02:08:49,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:50,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:50,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:08:52,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:53,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:08:55,745 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.890e+02 2.173e+02 2.583e+02 3.915e+02, threshold=4.346e+02, percent-clipped=0.0 2023-09-30 02:08:58,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 02:08:59,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 02:09:02,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:02,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:09:03,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:09:06,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.95 vs. limit=22.5 2023-09-30 02:09:07,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:15,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:09:16,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:09:18,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:09:24,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 02:09:26,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:09:27,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:33,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=562746.6666666666, ans=0.0 2023-09-30 02:09:34,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 02:09:35,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:09:39,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:09:39,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 02:09:40,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:40,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:44,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:44,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:09:44,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 02:09:47,325 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 02:09:48,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:51,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 02:09:53,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:54,821 INFO [train.py:1039] (1/4) Epoch 16, batch 4750, loss[loss=0.1907, simple_loss=0.2587, pruned_loss=0.06139, over 23783.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2582, pruned_loss=0.05523, over 4699588.97 frames. ], batch size: 212, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:09:55,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 02:09:58,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:09:58,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:02,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:04,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:10:04,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 02:10:05,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:06,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=562880.0, ans=0.0 2023-09-30 02:10:08,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 02:10:10,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:10:10,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:10:11,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:18,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 02:10:23,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:10:26,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 02:10:27,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:29,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:31,413 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 02:10:31,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 02:10:35,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=563013.3333333334, ans=0.2 2023-09-30 02:10:38,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 02:10:39,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:42,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:10:44,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:10:44,440 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 02:10:44,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:10:47,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:10:49,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:10:50,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 02:10:50,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 02:10:53,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:53,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:10:54,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:54,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:10:56,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 02:10:59,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 02:11:02,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:07,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:11:07,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 02:11:07,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:08,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:11,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:11:12,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:12,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:11:16,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:17,303 INFO [train.py:1039] (1/4) Epoch 16, batch 4800, loss[loss=0.1841, simple_loss=0.2515, pruned_loss=0.05841, over 23583.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2582, pruned_loss=0.05532, over 4693601.43 frames. ], batch size: 120, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:11:17,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 02:11:17,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 02:11:19,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 02:11:22,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:11:22,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:23,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 02:11:28,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:28,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:33,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:11:35,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:35,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:36,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 02:11:38,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:38,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:11:40,584 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.887e+02 2.070e+02 2.375e+02 3.869e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 02:11:42,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:11:46,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:11:47,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:47,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:11:48,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:48,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:11:49,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:50,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:54,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:55,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:11:58,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:11:59,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:02,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 02:12:02,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 02:12:04,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:04,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:12:05,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:12:05,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:05,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:12:07,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:12:07,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:11,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:16,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:17,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:22,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 02:12:22,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:23,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:24,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:12:25,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:26,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.91 vs. limit=10.0 2023-09-30 02:12:29,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:30,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:12:30,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:32,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:12:33,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:12:33,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:12:33,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=563480.0, ans=0.125 2023-09-30 02:12:37,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:37,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:37,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:40,151 INFO [train.py:1039] (1/4) Epoch 16, batch 4850, loss[loss=0.1738, simple_loss=0.2469, pruned_loss=0.05037, over 24294.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2577, pruned_loss=0.05486, over 4710769.12 frames. ], batch size: 56, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:12:40,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 02:12:41,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.78 vs. limit=15.0 2023-09-30 02:12:41,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 02:12:41,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:41,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:43,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:12:43,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:46,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:46,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=563546.6666666666, ans=0.0 2023-09-30 02:12:53,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 02:12:55,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:00,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:01,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:13:01,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:04,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:06,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:13:06,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:13:06,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 02:13:11,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:13:13,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.72 vs. limit=15.0 2023-09-30 02:13:14,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:13:14,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:13:16,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:13:16,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 02:13:17,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=563680.0, ans=0.125 2023-09-30 02:13:19,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:19,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 02:13:25,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 02:13:26,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:13:34,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:13:34,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 02:13:37,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:13:37,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:13:39,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:13:40,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 02:13:40,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:42,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 02:13:42,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:42,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:13:44,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 02:13:53,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:55,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=563813.3333333334, ans=0.125 2023-09-30 02:13:59,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:13:59,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:02,881 INFO [train.py:1039] (1/4) Epoch 16, batch 4900, loss[loss=0.2055, simple_loss=0.264, pruned_loss=0.07349, over 23818.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2574, pruned_loss=0.05482, over 4699538.93 frames. ], batch size: 212, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:14:04,117 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.71 vs. limit=10.0 2023-09-30 02:14:05,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 02:14:05,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:14:10,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:12,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:12,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:14:15,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 02:14:20,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 02:14:23,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=563946.6666666666, ans=0.1 2023-09-30 02:14:24,986 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.946e+02 2.133e+02 2.467e+02 3.436e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 02:14:27,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 02:14:27,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 02:14:28,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:28,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:28,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:14:28,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:28,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:14:30,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 02:14:35,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 02:14:36,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:14:37,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:14:39,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:41,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:14:42,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:42,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:42,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 02:14:43,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=564013.3333333334, ans=0.125 2023-09-30 02:14:44,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:14:44,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:44,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=564013.3333333334, ans=0.0 2023-09-30 02:14:46,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 02:14:46,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 02:14:49,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 02:14:50,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:14:51,742 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.63 vs. limit=15.0 2023-09-30 02:14:52,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:14:52,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:14:52,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:53,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:14:53,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:14:54,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 02:14:56,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:59,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:15:02,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:15:05,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 02:15:06,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:15:06,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:15:08,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 02:15:13,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:14,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:15:16,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 02:15:16,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:16,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:15:19,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:19,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=564146.6666666666, ans=0.125 2023-09-30 02:15:22,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=564146.6666666666, ans=0.125 2023-09-30 02:15:23,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:23,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:15:24,965 INFO [train.py:1039] (1/4) Epoch 16, batch 4950, loss[loss=0.1905, simple_loss=0.2656, pruned_loss=0.0577, over 23302.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2556, pruned_loss=0.05481, over 4691166.32 frames. ], batch size: 119, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:15:25,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:25,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 02:15:26,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:15:30,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:30,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:35,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 02:15:35,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 02:15:35,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:15:37,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 02:15:37,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:37,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:15:38,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:15:38,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:15:41,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=564280.0, ans=0.125 2023-09-30 02:15:42,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:42,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:15:44,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:15:46,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:47,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:47,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:50,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:15:55,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:56,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:16:00,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:00,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:01,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:16:02,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 02:16:03,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 02:16:06,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:08,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:16:08,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:16:10,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:16:10,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:16:11,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:16:13,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:17,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:16:20,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:16:22,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:23,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:23,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 02:16:23,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=564413.3333333334, ans=0.125 2023-09-30 02:16:24,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:16:26,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:16:29,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=564480.0, ans=0.1 2023-09-30 02:16:30,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:16:32,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:16:32,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:16:32,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:32,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:16:33,020 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=564480.0, ans=0.125 2023-09-30 02:16:34,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:16:34,997 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.70 vs. limit=15.0 2023-09-30 02:16:36,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:16:37,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:16:37,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:39,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 02:16:45,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:16:46,506 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=564546.6666666666, ans=0.04949747468305833 2023-09-30 02:16:47,451 INFO [train.py:1039] (1/4) Epoch 16, batch 5000, loss[loss=0.1786, simple_loss=0.259, pruned_loss=0.04912, over 24315.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2555, pruned_loss=0.0544, over 4708683.20 frames. ], batch size: 61, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:16:50,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 02:16:50,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:16:56,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:56,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:16:59,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 02:16:59,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 02:17:02,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:04,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 02:17:04,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:17:05,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:17:07,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 02:17:07,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:08,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:08,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 02:17:08,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:09,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:10,778 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.922e+02 2.167e+02 2.587e+02 4.159e+02, threshold=4.333e+02, percent-clipped=0.0 2023-09-30 02:17:12,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 02:17:12,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 02:17:13,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:17:13,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 02:17:14,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:17:14,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:15,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:17:15,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 02:17:15,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 02:17:17,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 02:17:17,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:19,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:19,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 02:17:20,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:17:21,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=564680.0, ans=0.1 2023-09-30 02:17:22,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:22,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:24,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:17:26,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 02:17:26,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:17:28,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:17:31,551 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 02:17:36,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:36,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:36,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:17:40,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 02:17:40,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:40,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:42,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:17:44,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 02:17:46,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:51,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=564746.6666666666, ans=0.125 2023-09-30 02:17:54,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 02:17:54,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=564813.3333333334, ans=0.05 2023-09-30 02:18:02,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:10,334 INFO [train.py:1039] (1/4) Epoch 16, batch 5050, loss[loss=0.1744, simple_loss=0.2431, pruned_loss=0.05288, over 24309.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2565, pruned_loss=0.0546, over 4714135.63 frames. ], batch size: 56, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:18:10,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:18:12,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:12,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:18:12,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:13,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:18:13,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:18:13,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:18,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:18,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 02:18:20,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:18:23,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:23,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=564880.0, ans=0.125 2023-09-30 02:18:24,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:18:24,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 02:18:26,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:26,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:18:30,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:18:31,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:18:31,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:18:41,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 02:18:43,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:18:44,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:18:44,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 02:18:44,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:18:44,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:46,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:46,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:18:46,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 02:18:47,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 02:18:47,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:52,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:18:54,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:55,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 02:18:57,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:18:57,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=565080.0, ans=0.125 2023-09-30 02:19:00,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 02:19:01,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:19:01,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:19:03,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:04,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:19:04,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=565080.0, ans=0.125 2023-09-30 02:19:05,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:07,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:19:08,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:09,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:19:09,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:19:10,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 02:19:11,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:19:13,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:19:17,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:17,132 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 02:19:17,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:19:18,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:20,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:20,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 02:19:23,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:23,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 02:19:23,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:26,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:26,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:28,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 02:19:28,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 02:19:31,367 INFO [train.py:1039] (1/4) Epoch 16, batch 5100, loss[loss=0.1795, simple_loss=0.2651, pruned_loss=0.0469, over 24659.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2578, pruned_loss=0.05494, over 4713899.34 frames. ], batch size: 68, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:19:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:31,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:19:31,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:19:34,689 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 02:19:38,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:42,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 02:19:42,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 02:19:44,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:45,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:48,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:50,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 02:19:50,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 02:19:54,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:54,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:19:55,998 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.868e+02 2.082e+02 2.336e+02 3.756e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-30 02:19:57,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:01,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 02:20:01,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:04,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:20:05,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 02:20:05,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=565346.6666666666, ans=0.0 2023-09-30 02:20:07,384 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.56 vs. limit=15.0 2023-09-30 02:20:08,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:08,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:09,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 02:20:12,024 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 02:20:12,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:14,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 02:20:14,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 02:20:16,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:19,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=565346.6666666666, ans=0.125 2023-09-30 02:20:25,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:20:26,126 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=565413.3333333334, ans=0.125 2023-09-30 02:20:27,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 02:20:28,826 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 02:20:28,839 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 02:20:29,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 02:20:29,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:32,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 02:20:35,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 02:20:37,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:20:39,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:20:40,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 02:20:44,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:20:45,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 02:20:51,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:20:51,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:20:51,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:20:53,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:20:53,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=565480.0, ans=0.0 2023-09-30 02:20:54,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:20:55,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=565546.6666666666, ans=0.1 2023-09-30 02:20:56,258 INFO [train.py:1039] (1/4) Epoch 16, batch 5150, loss[loss=0.1767, simple_loss=0.2487, pruned_loss=0.05234, over 21346.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2585, pruned_loss=0.0553, over 4715351.24 frames. ], batch size: 47, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:20:56,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:56,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 02:20:56,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 02:20:57,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 02:20:59,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:20:59,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 02:21:00,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:00,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:21:02,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:04,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:09,775 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:21:10,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:21:10,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 02:21:12,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:12,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:21:14,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=565613.3333333334, ans=0.125 2023-09-30 02:21:15,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:21:15,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:15,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:17,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:21:17,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:21:17,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 02:21:18,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:21:20,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:21:23,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:21:24,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 02:21:26,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=565613.3333333334, ans=0.125 2023-09-30 02:21:27,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:21:29,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=565680.0, ans=0.2 2023-09-30 02:21:32,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:21:35,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 02:21:38,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:45,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:46,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:50,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:21:50,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:21:53,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 02:21:58,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:22:00,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:22:00,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:22:02,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:04,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:06,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 02:22:10,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:11,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:22:12,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=565813.3333333334, ans=0.2 2023-09-30 02:22:14,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:22:14,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:22:14,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:22:15,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:22:15,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:22:15,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:22:19,276 INFO [train.py:1039] (1/4) Epoch 16, batch 5200, loss[loss=0.1961, simple_loss=0.2496, pruned_loss=0.07132, over 22656.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2586, pruned_loss=0.05501, over 4725305.22 frames. ], batch size: 322, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:22:19,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:22:22,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:22:23,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:29,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 02:22:29,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:22:30,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:32,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:34,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:22:34,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:36,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=565946.6666666666, ans=0.1 2023-09-30 02:22:39,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 02:22:41,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:22:41,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:44,239 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 1.894e+02 2.058e+02 2.319e+02 3.515e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:22:44,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 02:22:46,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:22:46,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:22:47,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 02:22:49,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 02:22:52,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 02:22:53,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:53,702 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 02:22:53,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:55,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:55,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:22:55,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 02:22:57,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:58,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:02,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 02:23:02,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 02:23:04,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 02:23:09,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 02:23:11,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:23:16,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:23:16,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:18,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 02:23:19,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:19,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:23:19,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:19,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:23:22,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:22,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:23:27,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:23:29,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:29,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:34,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:35,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 02:23:37,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:37,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:23:38,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:40,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:23:40,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:23:42,263 INFO [train.py:1039] (1/4) Epoch 16, batch 5250, loss[loss=0.1642, simple_loss=0.2481, pruned_loss=0.04021, over 24482.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2579, pruned_loss=0.05436, over 4723670.59 frames. ], batch size: 66, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:23:45,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:23:49,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=22.5 2023-09-30 02:23:50,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:50,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:23:52,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:23:58,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:58,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:24:01,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:24:03,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:24:05,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 02:24:05,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:24:08,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:24:46,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=566480.0, ans=0.1 2023-09-30 02:24:50,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.97 vs. limit=6.0 2023-09-30 02:24:51,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=566480.0, ans=0.1 2023-09-30 02:24:57,069 INFO [train.py:1039] (1/4) Epoch 16, batch 5300, loss[loss=0.1642, simple_loss=0.2457, pruned_loss=0.04134, over 24675.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2568, pruned_loss=0.05461, over 4724173.40 frames. ], batch size: 65, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:25:08,490 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:25:12,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:25:12,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 02:25:12,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 02:25:12,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:12,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:12,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:12,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:12,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:12,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:13,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:13,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:25:13,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:25:13,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 02:25:13,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 02:25:13,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 02:25:14,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:25:14,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 02:25:14,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 02:25:14,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:15,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:15,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:15,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:15,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:25:16,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:16,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:16,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:16,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:16,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:16,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:25:16,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:16,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:25:17,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 02:25:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:18,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:18,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 02:25:18,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 02:25:18,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:25:18,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:18,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 02:25:19,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 02:25:19,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:25:20,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:20,274 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 02:25:20,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 02:25:20,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:25:20,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:20,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 02:25:20,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 02:25:20,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 02:25:21,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:29,927 INFO [train.py:1039] (1/4) Epoch 17, batch 0, loss[loss=0.1789, simple_loss=0.2604, pruned_loss=0.04869, over 24050.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2604, pruned_loss=0.04869, over 24050.00 frames. ], batch size: 80, lr: 6.17e-03, grad_scale: 32.0 2023-09-30 02:25:29,928 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 02:25:40,568 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.0471, 3.3460, 4.7660, 3.8026], device='cuda:1') 2023-09-30 02:25:43,989 INFO [train.py:1071] (1/4) Epoch 17, validation: loss=0.3013, simple_loss=0.2697, pruned_loss=0.1665, over 1125622.00 frames. 2023-09-30 02:25:43,990 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 02:25:45,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 02:25:47,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:25:49,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.973e+02 2.191e+02 2.524e+02 3.767e+02, threshold=4.382e+02, percent-clipped=0.0 2023-09-30 02:25:49,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:25:56,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:56,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:25:56,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:58,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 02:26:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 02:26:01,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:02,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:05,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:05,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:07,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:26:07,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:09,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 02:26:10,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:18,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=566760.0, ans=0.0 2023-09-30 02:26:19,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:26:19,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:19,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=566760.0, ans=0.125 2023-09-30 02:26:22,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 02:26:26,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:26:26,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:26:27,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:32,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:26:35,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:41,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 02:26:45,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 02:26:45,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:26:45,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:46,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:26:47,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:47,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=566826.6666666666, ans=0.125 2023-09-30 02:26:49,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 02:26:50,651 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.50 vs. limit=22.5 2023-09-30 02:26:52,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:53,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:58,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:02,200 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 02:27:03,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:27:07,325 INFO [train.py:1039] (1/4) Epoch 17, batch 50, loss[loss=0.1878, simple_loss=0.2605, pruned_loss=0.05753, over 23432.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2614, pruned_loss=0.05695, over 1059895.72 frames. ], batch size: 285, lr: 6.17e-03, grad_scale: 16.0 2023-09-30 02:27:07,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:10,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:10,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 02:27:10,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:27:12,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:27:13,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:15,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:18,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:22,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 02:27:22,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:27,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:27:30,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 02:27:31,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 02:27:34,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:27:35,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=567026.6666666666, ans=0.1 2023-09-30 02:27:36,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:27:36,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:27:38,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:27:38,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:27:38,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:39,449 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.52 vs. limit=12.0 2023-09-30 02:27:45,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=567093.3333333334, ans=0.1 2023-09-30 02:27:46,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:27:47,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:48,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:27:48,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 02:27:52,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:27:53,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:27:53,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 02:27:53,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:56,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 02:28:05,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:05,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:28:05,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:06,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:06,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:09,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 02:28:09,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 02:28:12,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:12,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:13,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:28:13,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:28:15,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 02:28:16,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 02:28:18,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:28:18,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:18,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:28:19,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 02:28:19,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 02:28:19,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:20,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=567226.6666666666, ans=0.125 2023-09-30 02:28:21,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:23,169 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=567226.6666666666, ans=0.1 2023-09-30 02:28:25,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:28:25,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:28:28,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:28:29,421 INFO [train.py:1039] (1/4) Epoch 17, batch 100, loss[loss=0.1654, simple_loss=0.2455, pruned_loss=0.0426, over 24425.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2611, pruned_loss=0.0567, over 1854984.53 frames. ], batch size: 63, lr: 6.16e-03, grad_scale: 16.0 2023-09-30 02:28:31,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:28:34,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:36,141 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.907e+02 2.184e+02 2.612e+02 4.946e+02, threshold=4.368e+02, percent-clipped=2.0 2023-09-30 02:28:36,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 02:28:36,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:39,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:28:39,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:41,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:41,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:41,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:42,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 02:28:44,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:28:44,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:45,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:45,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:49,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 02:28:51,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:52,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:54,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:28:55,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:28:59,510 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 02:28:59,537 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 02:28:59,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:28:59,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:29:02,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:29:05,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:29:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,505 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 02:29:15,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:29:20,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:20,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:29:22,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:27,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:28,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:31,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:29:32,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:34,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:37,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:37,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:29:37,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:38,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 02:29:38,685 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 02:29:38,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:38,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:29:39,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=567560.0, ans=0.0 2023-09-30 02:29:40,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:40,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:40,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:29:41,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:29:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:29:41,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:41,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:43,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:43,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:29:45,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:29:46,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:49,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:49,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:29:51,227 INFO [train.py:1039] (1/4) Epoch 17, batch 150, loss[loss=0.2034, simple_loss=0.2692, pruned_loss=0.06876, over 22778.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2623, pruned_loss=0.05712, over 2489703.57 frames. ], batch size: 322, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:29:51,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:51,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=567626.6666666666, ans=0.125 2023-09-30 02:29:53,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:55,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:57,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:58,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:03,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 02:30:03,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 02:30:03,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 02:30:05,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:30:06,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:30:08,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:30:09,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:30:09,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:09,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:11,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:12,782 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 02:30:14,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:21,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:25,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:30:25,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 02:30:29,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:30:29,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:29,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:30:31,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=567760.0, ans=0.0 2023-09-30 02:30:31,436 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.69 vs. limit=10.0 2023-09-30 02:30:33,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:30:36,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:30:36,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:30:38,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:39,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 02:30:46,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:47,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:30:48,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:30:48,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:30:51,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:52,007 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.97 vs. limit=15.0 2023-09-30 02:30:52,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 02:30:55,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:30:56,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=567893.3333333334, ans=0.125 2023-09-30 02:30:58,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:30:59,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:02,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:31:02,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 02:31:02,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:31:02,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=567893.3333333334, ans=0.125 2023-09-30 02:31:03,831 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 02:31:06,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:07,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=567893.3333333334, ans=0.125 2023-09-30 02:31:12,017 INFO [train.py:1039] (1/4) Epoch 17, batch 200, loss[loss=0.192, simple_loss=0.2606, pruned_loss=0.06173, over 23645.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2614, pruned_loss=0.05653, over 2999455.10 frames. ], batch size: 149, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:31:12,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:31:12,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:31:15,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 02:31:16,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:16,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=567960.0, ans=0.125 2023-09-30 02:31:18,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:20,235 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.902e+02 2.115e+02 2.489e+02 3.841e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 02:31:22,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 02:31:22,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=567960.0, ans=0.125 2023-09-30 02:31:23,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:31:23,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:25,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:28,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:31:28,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:28,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:37,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=568026.6666666666, ans=0.1 2023-09-30 02:31:38,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=568026.6666666666, ans=0.07 2023-09-30 02:31:47,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=568093.3333333334, ans=0.1 2023-09-30 02:31:50,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:31:51,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:31:52,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=568093.3333333334, ans=15.0 2023-09-30 02:31:53,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:31:53,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:31:55,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:31:55,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:31:56,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:57,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:31:58,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:58,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:00,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 02:32:00,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:32:00,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:03,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:32:10,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:32:15,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:15,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:32:17,845 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:32:23,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:26,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 02:32:26,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:27,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:32:27,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:29,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:32:30,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 02:32:32,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:32:32,402 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 02:32:35,454 INFO [train.py:1039] (1/4) Epoch 17, batch 250, loss[loss=0.1714, simple_loss=0.2557, pruned_loss=0.04356, over 24497.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.26, pruned_loss=0.05508, over 3378884.09 frames. ], batch size: 66, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:32:35,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:35,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=568293.3333333334, ans=0.125 2023-09-30 02:32:37,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:32:39,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:40,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:41,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=568293.3333333334, ans=0.125 2023-09-30 02:32:42,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:32:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:45,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:32:47,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=568293.3333333334, ans=0.2 2023-09-30 02:32:48,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:33:00,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:02,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:04,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:33:10,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:33:11,310 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.32 vs. limit=15.0 2023-09-30 02:33:12,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:33:14,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:33:14,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:33:15,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:33:15,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:18,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:33:21,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=568426.6666666666, ans=0.125 2023-09-30 02:33:23,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 02:33:23,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:24,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:33:24,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:33:24,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:33:26,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:33:27,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:33:28,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:33:31,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:31,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:33:31,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:35,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:33:40,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.49 vs. limit=15.0 2023-09-30 02:33:40,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:43,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:33:45,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=568560.0, ans=0.025 2023-09-30 02:33:50,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:51,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:33:55,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 02:33:57,772 INFO [train.py:1039] (1/4) Epoch 17, batch 300, loss[loss=0.1794, simple_loss=0.2445, pruned_loss=0.05716, over 23685.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.258, pruned_loss=0.05443, over 3677777.78 frames. ], batch size: 135, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:33:57,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:59,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:34:00,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 02:34:00,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:34:02,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:34:02,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 02:34:05,244 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.942e+02 2.251e+02 2.659e+02 4.378e+02, threshold=4.502e+02, percent-clipped=1.0 2023-09-30 02:34:07,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:34:07,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:10,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:34:11,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 02:34:11,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=568626.6666666666, ans=0.1 2023-09-30 02:34:14,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:34:14,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:34:14,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 02:34:14,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:17,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:34:22,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:34:24,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 02:34:27,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 02:34:27,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:27,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=568693.3333333334, ans=0.2 2023-09-30 02:34:30,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:32,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=568760.0, ans=0.125 2023-09-30 02:34:33,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:33,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 02:34:33,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:34:35,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:34:37,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:34:37,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:34:42,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:34:42,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 02:34:44,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:34:47,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:47,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 02:34:49,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:53,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=568826.6666666666, ans=0.125 2023-09-30 02:34:54,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:34:59,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:34:59,976 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.32 vs. limit=15.0 2023-09-30 02:35:00,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 02:35:03,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:03,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:35:07,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:08,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:35:08,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 02:35:08,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:35:08,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:10,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 02:35:13,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:13,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:15,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:15,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:15,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:20,311 INFO [train.py:1039] (1/4) Epoch 17, batch 350, loss[loss=0.1825, simple_loss=0.2661, pruned_loss=0.04948, over 24477.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2562, pruned_loss=0.05361, over 3909300.85 frames. ], batch size: 66, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:35:22,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:22,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:35:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:31,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:35,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:35,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:37,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=569026.6666666666, ans=0.125 2023-09-30 02:35:38,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 02:35:41,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:41,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 02:35:43,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:43,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 02:35:44,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:48,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 02:35:52,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:35:52,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=569093.3333333334, ans=0.2 2023-09-30 02:35:53,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:35:55,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:55,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:57,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:35:57,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:58,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:36:00,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:00,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:06,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:09,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:36:09,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:36:10,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:16,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 02:36:16,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:20,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:20,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:20,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:36:21,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 02:36:23,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:23,953 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 02:36:27,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 02:36:27,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:30,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:36:30,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 02:36:31,781 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.96 vs. limit=15.0 2023-09-30 02:36:32,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:32,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=569226.6666666666, ans=0.125 2023-09-30 02:36:34,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:36:35,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:37,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:37,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:41,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:44,002 INFO [train.py:1039] (1/4) Epoch 17, batch 400, loss[loss=0.1903, simple_loss=0.2658, pruned_loss=0.05744, over 24456.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.257, pruned_loss=0.05394, over 4103755.10 frames. ], batch size: 63, lr: 6.15e-03, grad_scale: 16.0 2023-09-30 02:36:44,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:44,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=569293.3333333334, ans=0.0 2023-09-30 02:36:45,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:36:47,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 02:36:47,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:47,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:36:50,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:51,860 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.862e+02 1.986e+02 2.213e+02 4.165e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 02:36:55,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:56,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=569293.3333333334, ans=0.2 2023-09-30 02:36:57,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:58,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 02:37:00,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 02:37:00,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:02,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 02:37:04,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:05,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:37:05,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:05,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 02:37:07,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:37:07,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:08,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:08,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:37:12,417 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 02:37:13,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 02:37:16,861 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=569426.6666666666, ans=0.0 2023-09-30 02:37:18,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:18,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:19,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 02:37:22,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 02:37:25,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:37:29,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:35,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 02:37:35,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=569493.3333333334, ans=0.2 2023-09-30 02:37:38,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:37:40,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 02:37:40,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:43,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:37:43,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 02:37:47,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:37:47,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569493.3333333334, ans=0.1 2023-09-30 02:37:51,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:37:52,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:54,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:55,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 02:37:57,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:37:58,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 02:37:59,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:37:59,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:37:59,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569560.0, ans=0.1 2023-09-30 02:38:02,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 02:38:05,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:38:05,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:38:07,149 INFO [train.py:1039] (1/4) Epoch 17, batch 450, loss[loss=0.235, simple_loss=0.2909, pruned_loss=0.08955, over 19659.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2571, pruned_loss=0.05357, over 4246296.54 frames. ], batch size: 388, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:38:07,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:38:07,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 02:38:07,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:38:08,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:38:09,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:38:09,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 02:38:09,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:38:11,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:38:14,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:38:24,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:24,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:38:25,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=569693.3333333334, ans=0.125 2023-09-30 02:38:26,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 02:38:28,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 02:38:31,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569693.3333333334, ans=0.1 2023-09-30 02:38:32,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:38:32,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=569693.3333333334, ans=0.0 2023-09-30 02:38:36,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:37,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:42,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:42,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:45,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 02:38:45,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 02:38:47,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 02:38:48,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:38:48,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:50,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:38:51,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=569760.0, ans=0.0 2023-09-30 02:38:52,614 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 02:38:52,631 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 02:38:52,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:54,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:38:57,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:39:00,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:39:02,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:39:02,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:39:03,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 02:39:05,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:07,414 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:39:08,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:39:09,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:39:11,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 02:39:16,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:39:16,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 02:39:18,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 02:39:19,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:23,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:39:26,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:27,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:39:27,915 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 02:39:29,811 INFO [train.py:1039] (1/4) Epoch 17, batch 500, loss[loss=0.189, simple_loss=0.2608, pruned_loss=0.05856, over 23892.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2576, pruned_loss=0.0537, over 4339759.14 frames. ], batch size: 195, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:39:33,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:33,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:39:33,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:35,244 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 02:39:36,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 02:39:36,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:39,660 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.860e+02 2.180e+02 2.486e+02 3.417e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:39:39,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:39:44,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:39:46,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:39:47,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:47,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:47,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:39:59,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:39:59,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:40:01,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:40:01,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:01,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 02:40:01,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:40:03,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=570093.3333333334, ans=0.1 2023-09-30 02:40:05,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:40:05,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:40:05,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:40:05,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:07,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 02:40:09,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=570093.3333333334, ans=0.125 2023-09-30 02:40:10,529 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 02:40:13,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:15,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:17,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:40:20,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 02:40:24,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:40:25,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:31,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:34,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:39,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.44 vs. limit=12.0 2023-09-30 02:40:40,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:45,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 02:40:45,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:45,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:47,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 02:40:48,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:40:48,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:53,259 INFO [train.py:1039] (1/4) Epoch 17, batch 550, loss[loss=0.1504, simple_loss=0.2219, pruned_loss=0.03948, over 24406.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2588, pruned_loss=0.05425, over 4421351.42 frames. ], batch size: 58, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:40:54,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 02:40:55,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=570293.3333333334, ans=0.0 2023-09-30 02:40:56,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 02:40:56,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:56,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 02:40:57,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:40:57,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:59,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:00,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:00,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:41:02,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:41:04,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:41:06,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 02:41:06,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:41:11,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:11,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:12,163 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=570360.0, ans=0.125 2023-09-30 02:41:13,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:14,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:20,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 02:41:20,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 02:41:23,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:41:27,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:41:28,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:29,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:41:33,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=570426.6666666666, ans=0.015 2023-09-30 02:41:34,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:34,627 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 02:41:36,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:37,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:41:37,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=570426.6666666666, ans=0.2 2023-09-30 02:41:42,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:42,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:41:42,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:41:44,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:45,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 02:41:48,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 02:41:48,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:41:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:48,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:41:48,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:41:51,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:41:54,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:41:57,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:41:57,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:59,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:42:00,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:42:02,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:02,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:42:02,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=570560.0, ans=0.1 2023-09-30 02:42:03,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:03,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:42:03,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:42:12,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 02:42:15,749 INFO [train.py:1039] (1/4) Epoch 17, batch 600, loss[loss=0.1801, simple_loss=0.2617, pruned_loss=0.0492, over 24675.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2592, pruned_loss=0.05472, over 4487863.10 frames. ], batch size: 73, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:42:15,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 02:42:16,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:42:17,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:42:17,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:21,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=570626.6666666666, ans=0.1 2023-09-30 02:42:26,651 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 1.971e+02 2.265e+02 3.697e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 02:42:26,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:42:28,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:42:30,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 02:42:30,984 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.78 vs. limit=12.0 2023-09-30 02:42:31,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:42:33,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:42:36,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:37,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 02:42:39,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:42:41,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=570693.3333333334, ans=0.125 2023-09-30 02:42:45,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 02:42:47,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=570760.0, ans=0.125 2023-09-30 02:42:48,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:42:48,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:48,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:42:57,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:42:57,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:42:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:04,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:43:05,002 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=570826.6666666666, ans=0.0 2023-09-30 02:43:09,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:09,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:43:09,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:43:18,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 02:43:24,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:43:24,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:43:29,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 02:43:31,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:43:33,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 02:43:33,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:43:34,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:43:39,673 INFO [train.py:1039] (1/4) Epoch 17, batch 650, loss[loss=0.2038, simple_loss=0.2847, pruned_loss=0.06139, over 23885.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2585, pruned_loss=0.0545, over 4527835.45 frames. ], batch size: 86, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:43:42,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:43:42,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:43:43,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=570960.0, ans=0.1 2023-09-30 02:43:44,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:43:46,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:43:49,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:43:51,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 02:43:52,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:57,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:43:57,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:01,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:06,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 02:44:07,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:07,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:11,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:11,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:44:14,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:14,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:14,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:44:16,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:17,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:44:18,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:44:19,431 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 02:44:19,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:19,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:24,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:24,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:26,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:26,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:44:27,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 02:44:29,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:44:29,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:44:30,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:44:30,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:32,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:44:34,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 02:44:35,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=571160.0, ans=0.5 2023-09-30 02:44:36,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 02:44:37,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:37,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:38,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:44:39,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:39,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=571160.0, ans=0.0 2023-09-30 02:44:40,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:47,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:47,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:53,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:53,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:44:53,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:45:00,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=571293.3333333334, ans=0.1 2023-09-30 02:45:01,943 INFO [train.py:1039] (1/4) Epoch 17, batch 700, loss[loss=0.1742, simple_loss=0.2432, pruned_loss=0.05262, over 23717.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2571, pruned_loss=0.05429, over 4559477.07 frames. ], batch size: 232, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:45:02,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:45:02,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:02,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:02,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:04,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=571293.3333333334, ans=0.1 2023-09-30 02:45:06,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 02:45:08,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 02:45:12,196 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.891e+02 2.090e+02 2.521e+02 3.567e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 02:45:12,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 02:45:13,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:15,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:45:16,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 02:45:20,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:23,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:45:26,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:27,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=571360.0, ans=0.0 2023-09-30 02:45:28,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:45:28,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:45:30,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:33,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 02:45:33,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:45:34,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 02:45:35,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=571426.6666666666, ans=0.125 2023-09-30 02:45:38,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 02:45:42,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:45:42,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:45:42,339 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571426.6666666666, ans=0.1 2023-09-30 02:45:45,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:45:48,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:45:48,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=571426.6666666666, ans=0.125 2023-09-30 02:45:50,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 02:45:55,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:55,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:45:55,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 02:45:58,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:46:00,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:03,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:08,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:46:09,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 02:46:13,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 02:46:13,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 02:46:15,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:19,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:19,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:46:22,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:22,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 02:46:25,714 INFO [train.py:1039] (1/4) Epoch 17, batch 750, loss[loss=0.1723, simple_loss=0.2531, pruned_loss=0.04577, over 24300.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2564, pruned_loss=0.05384, over 4584063.57 frames. ], batch size: 61, lr: 6.14e-03, grad_scale: 4.0 2023-09-30 02:46:28,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 02:46:28,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 02:46:28,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 02:46:30,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 02:46:30,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 02:46:31,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:46:32,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 02:46:33,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:35,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:46:36,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:38,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:38,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:46:38,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:42,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:46:42,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:46:45,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:46:48,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:48,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:49,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 02:46:51,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:46:52,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:53,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:57,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:46:59,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 02:46:59,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:00,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 02:47:00,813 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 02:47:00,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 02:47:00,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:47:00,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:47:03,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:47:09,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=571760.0, ans=0.07 2023-09-30 02:47:10,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:47:11,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:11,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:47:13,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:47:17,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:47:17,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 02:47:17,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:47:18,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:47:20,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:47:23,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:47:24,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 02:47:24,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:31,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:47:33,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:47:33,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:47:36,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:47:39,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 02:47:40,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:40,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:46,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:48,748 INFO [train.py:1039] (1/4) Epoch 17, batch 800, loss[loss=0.1853, simple_loss=0.2588, pruned_loss=0.05588, over 23359.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2572, pruned_loss=0.05403, over 4612751.15 frames. ], batch size: 105, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:47:48,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:47:50,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=571960.0, ans=0.2 2023-09-30 02:47:55,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:55,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:47:56,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:56,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:58,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:00,274 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.820e+02 2.048e+02 2.346e+02 3.292e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 02:48:00,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:02,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:05,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:07,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:48:10,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 02:48:11,503 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.06 vs. limit=15.0 2023-09-30 02:48:12,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:13,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:48:15,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:48:15,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:16,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 02:48:16,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:18,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 02:48:18,874 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.57 vs. limit=10.0 2023-09-30 02:48:21,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:24,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:26,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:48:26,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:28,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:28,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:32,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:48:34,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:48:34,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:48:36,568 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 02:48:36,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 02:48:36,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:48:36,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:48:39,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:39,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:48:44,902 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 02:48:46,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 02:48:48,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:48:50,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=572160.0, ans=0.125 2023-09-30 02:48:51,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:48:54,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:48:58,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:00,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 02:49:00,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:49:04,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 02:49:09,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:11,245 INFO [train.py:1039] (1/4) Epoch 17, batch 850, loss[loss=0.1896, simple_loss=0.2717, pruned_loss=0.05371, over 23927.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2578, pruned_loss=0.05406, over 4643225.00 frames. ], batch size: 80, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:49:11,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:49:12,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 02:49:12,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:49:14,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:14,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 02:49:14,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:16,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:49:18,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:19,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:49:19,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=572293.3333333334, ans=0.0 2023-09-30 02:49:21,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:49:23,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 02:49:24,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 02:49:24,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 02:49:26,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:26,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:49:28,636 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.51 vs. limit=15.0 2023-09-30 02:49:29,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:29,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:29,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:49:35,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:35,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:37,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 02:49:40,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 02:49:44,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:44,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 02:49:47,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 02:49:49,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 02:49:51,592 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 02:49:51,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:49:51,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:49:52,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:49:54,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:56,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:56,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 02:49:59,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:50:01,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:02,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:50:02,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:50:04,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:50:06,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:50:06,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 02:50:08,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=572493.3333333334, ans=0.0 2023-09-30 02:50:11,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:50:11,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:12,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:50:12,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:14,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:15,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:50:18,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=572560.0, ans=0.1 2023-09-30 02:50:19,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:50:21,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:50:21,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:22,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:50:28,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=572560.0, ans=0.2 2023-09-30 02:50:29,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:50:31,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:32,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 02:50:33,927 INFO [train.py:1039] (1/4) Epoch 17, batch 900, loss[loss=0.1734, simple_loss=0.2561, pruned_loss=0.04531, over 24502.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2583, pruned_loss=0.05477, over 4657767.22 frames. ], batch size: 66, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:50:34,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:34,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:37,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 02:50:44,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:50:45,733 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.971e+02 2.244e+02 2.720e+02 3.662e+02, threshold=4.487e+02, percent-clipped=0.0 2023-09-30 02:50:47,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:47,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=572626.6666666666, ans=0.125 2023-09-30 02:50:49,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 02:50:49,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=572693.3333333334, ans=0.0 2023-09-30 02:50:52,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:50:52,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 02:50:53,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:50:55,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:55,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:50:55,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:50:55,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:50:59,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=572693.3333333334, ans=0.125 2023-09-30 02:51:05,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:06,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:51:07,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:51:07,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=572760.0, ans=0.125 2023-09-30 02:51:11,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:15,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 02:51:18,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:51:20,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:51:22,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:51:22,389 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 02:51:23,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 02:51:24,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=572826.6666666666, ans=0.0 2023-09-30 02:51:30,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:51:30,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:51:32,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:51:40,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:40,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:51:41,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=572893.3333333334, ans=0.125 2023-09-30 02:51:42,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 02:51:42,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:42,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=572893.3333333334, ans=0.125 2023-09-30 02:51:45,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 02:51:46,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:51:46,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:49,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:51:49,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:51:51,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=572893.3333333334, ans=0.125 2023-09-30 02:51:53,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 02:51:55,220 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 02:51:55,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:51:55,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 02:51:56,789 INFO [train.py:1039] (1/4) Epoch 17, batch 950, loss[loss=0.2475, simple_loss=0.3073, pruned_loss=0.09382, over 19812.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2583, pruned_loss=0.05493, over 4668766.85 frames. ], batch size: 388, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:51:58,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:03,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 02:52:07,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=572960.0, ans=0.0 2023-09-30 02:52:07,185 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=572960.0, ans=0.0 2023-09-30 02:52:08,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:10,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:52:12,595 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 02:52:18,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:18,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:18,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:18,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:52:20,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 02:52:22,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:52:23,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:27,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 02:52:28,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:33,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:33,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:34,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:34,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 02:52:37,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:52:38,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:40,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:52:44,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:52:44,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:48,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 02:52:51,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 02:52:51,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:52:51,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:52:51,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:51,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:52:56,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 02:52:58,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:53:01,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:02,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=573226.6666666666, ans=0.125 2023-09-30 02:53:03,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:03,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 02:53:03,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:03,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:53:03,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 02:53:08,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:53:10,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:14,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:16,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 02:53:16,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 02:53:18,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=573293.3333333334, ans=0.125 2023-09-30 02:53:20,123 INFO [train.py:1039] (1/4) Epoch 17, batch 1000, loss[loss=0.1852, simple_loss=0.2594, pruned_loss=0.05544, over 23337.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.258, pruned_loss=0.05532, over 4675476.87 frames. ], batch size: 105, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:53:20,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:23,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 02:53:23,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:53:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:53:30,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 02:53:30,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 02:53:32,101 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.898e+02 2.133e+02 2.497e+02 3.739e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-30 02:53:36,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:36,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:37,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:42,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 02:53:45,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 02:53:47,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 02:53:47,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:53:51,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 02:53:52,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 02:53:52,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=573426.6666666666, ans=0.125 2023-09-30 02:53:53,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 02:53:54,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:55,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:02,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=573426.6666666666, ans=0.125 2023-09-30 02:54:03,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:05,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:54:05,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:05,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=573426.6666666666, ans=0.125 2023-09-30 02:54:06,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:06,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 02:54:07,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:08,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:54:08,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:08,623 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 02:54:12,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 02:54:14,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 02:54:17,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 02:54:18,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:54:27,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:54:27,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:54:28,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 02:54:30,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:54:30,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 02:54:32,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 02:54:33,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:54:33,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:36,189 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=573560.0, ans=0.125 2023-09-30 02:54:37,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:54:39,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:54:39,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:42,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=573626.6666666666, ans=0.125 2023-09-30 02:54:44,362 INFO [train.py:1039] (1/4) Epoch 17, batch 1050, loss[loss=0.1754, simple_loss=0.2482, pruned_loss=0.05134, over 21854.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2566, pruned_loss=0.0547, over 4673680.72 frames. ], batch size: 48, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:54:44,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:54:45,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:54:45,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=573626.6666666666, ans=0.1 2023-09-30 02:54:48,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:54:48,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=573626.6666666666, ans=0.125 2023-09-30 02:54:49,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:51,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:54:51,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=573626.6666666666, ans=0.125 2023-09-30 02:54:55,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:54:57,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:54:59,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:55:01,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:55:01,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:55:02,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:55:02,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 02:55:04,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:05,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 02:55:08,166 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-09-30 02:55:08,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:55:08,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 02:55:08,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 02:55:15,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:55:17,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:55:17,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:20,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=573760.0, ans=0.1 2023-09-30 02:55:22,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 02:55:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 02:55:22,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:55:24,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 02:55:28,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 02:55:28,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:55:28,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=573760.0, ans=0.1 2023-09-30 02:55:31,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:55:32,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.56 vs. limit=15.0 2023-09-30 02:55:33,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 02:55:34,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:55:35,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:55:39,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:55:44,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 02:55:45,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 02:55:46,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 02:55:46,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:46,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:55:48,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 02:55:51,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:55:54,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:54,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:55:56,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:55:56,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:55:58,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=573893.3333333334, ans=0.1 2023-09-30 02:55:59,029 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.13 vs. limit=15.0 2023-09-30 02:56:00,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 02:56:01,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:56:01,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 02:56:01,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 02:56:03,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:56:06,392 INFO [train.py:1039] (1/4) Epoch 17, batch 1100, loss[loss=0.1873, simple_loss=0.2517, pruned_loss=0.06143, over 23633.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2559, pruned_loss=0.05436, over 4683282.61 frames. ], batch size: 232, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:56:06,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:13,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:56:18,396 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.918e+02 2.180e+02 2.425e+02 3.491e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:56:20,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:56:20,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:56:20,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:21,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 02:56:22,421 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.78 vs. limit=12.0 2023-09-30 02:56:23,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:56:25,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:56:28,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:56:31,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:56:31,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 02:56:35,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:56:36,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:36,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:56:38,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:56:42,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:56:48,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:56:50,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 02:56:51,450 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 02:56:51,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:55,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:55,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:56:55,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:55,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:56:56,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:56:56,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:56:58,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 02:56:58,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:56:58,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:56:58,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:56:59,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:59,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 02:57:06,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:57:06,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 02:57:09,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:57:10,256 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.39 vs. limit=15.0 2023-09-30 02:57:14,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:57:17,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=574226.6666666666, ans=0.2 2023-09-30 02:57:18,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 02:57:18,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:57:18,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:20,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:22,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:22,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 02:57:24,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:57:24,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:24,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=574226.6666666666, ans=0.125 2023-09-30 02:57:27,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 02:57:27,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:57:27,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 02:57:28,976 INFO [train.py:1039] (1/4) Epoch 17, batch 1150, loss[loss=0.1882, simple_loss=0.2515, pruned_loss=0.06242, over 23675.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2566, pruned_loss=0.05444, over 4695782.42 frames. ], batch size: 232, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:57:29,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:57:29,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:57:30,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:57:35,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=574293.3333333334, ans=0.0 2023-09-30 02:57:36,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:39,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:57:41,610 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.53 vs. limit=6.0 2023-09-30 02:57:42,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:42,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:57:44,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 02:57:44,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:57:47,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 02:57:48,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:57:49,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=574360.0, ans=0.125 2023-09-30 02:57:49,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=574360.0, ans=0.125 2023-09-30 02:57:49,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=574360.0, ans=0.2 2023-09-30 02:57:52,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 02:57:55,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:59,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:58:00,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:01,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 02:58:01,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:58:01,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:58:03,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=574426.6666666666, ans=0.125 2023-09-30 02:58:04,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 02:58:04,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:58:05,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:58:12,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=574426.6666666666, ans=0.2 2023-09-30 02:58:17,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:22,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:24,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 02:58:24,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:24,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:30,911 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 02:58:33,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:41,353 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.82 vs. limit=12.0 2023-09-30 02:58:42,093 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 02:58:46,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:58:47,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=574560.0, ans=0.2 2023-09-30 02:58:48,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:58:48,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:58:48,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:58:51,675 INFO [train.py:1039] (1/4) Epoch 17, batch 1200, loss[loss=0.192, simple_loss=0.2591, pruned_loss=0.06249, over 23577.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.257, pruned_loss=0.05475, over 4697571.66 frames. ], batch size: 256, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 02:58:51,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:58:56,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=22.5 2023-09-30 02:58:57,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:58:59,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:59:00,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:00,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:01,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:59:03,935 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.927e+02 2.114e+02 2.514e+02 4.321e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 02:59:04,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:59:05,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:59:07,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:07,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:11,717 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 02:59:13,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 02:59:17,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:59:17,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=574693.3333333334, ans=0.0 2023-09-30 02:59:20,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:59:21,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:25,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:59:25,296 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 02:59:25,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:35,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:59:35,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:59:35,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 02:59:37,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:59:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 02:59:43,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 02:59:45,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:45,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:46,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:46,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:59:48,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:48,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:59:50,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:59:50,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 02:59:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:59:51,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:59:51,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:59:53,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:53,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:58,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:59:59,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=574893.3333333334, ans=0.125 2023-09-30 03:00:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:00:06,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 03:00:10,316 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 03:00:13,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:14,774 INFO [train.py:1039] (1/4) Epoch 17, batch 1250, loss[loss=0.1973, simple_loss=0.2662, pruned_loss=0.06423, over 23509.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2571, pruned_loss=0.05457, over 4711479.62 frames. ], batch size: 285, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:00:14,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:00:15,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:00:16,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:00:21,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 03:00:24,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:00:26,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:27,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 03:00:30,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:00:32,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:00:36,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:00:37,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:00:37,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:40,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:00:44,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:00:46,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:00:46,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:47,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:48,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:00:51,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:00:52,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:00:57,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 03:00:59,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:00:59,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=575093.3333333334, ans=0.125 2023-09-30 03:01:02,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:02,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 03:01:02,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:01:02,498 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 03:01:04,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:04,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:06,475 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.01 vs. limit=15.0 2023-09-30 03:01:08,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:12,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:12,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:01:13,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 03:01:15,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 03:01:15,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 03:01:18,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:21,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 03:01:21,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:23,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:01:23,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:01:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 03:01:24,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:01:24,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:01:24,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:01:26,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:27,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 03:01:30,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:32,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:01:34,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:01:37,371 INFO [train.py:1039] (1/4) Epoch 17, batch 1300, loss[loss=0.1553, simple_loss=0.2358, pruned_loss=0.03746, over 24468.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.258, pruned_loss=0.0551, over 4705690.49 frames. ], batch size: 63, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:01:38,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:01:42,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:42,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 03:01:48,561 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.803e+02 1.977e+02 2.132e+02 2.913e+02, threshold=3.954e+02, percent-clipped=0.0 2023-09-30 03:01:48,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:50,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:01:50,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:01:53,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=575360.0, ans=0.125 2023-09-30 03:01:54,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:54,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:01:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 03:01:59,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:02:00,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:02:02,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 03:02:05,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:02:09,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:10,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:12,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:02:14,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=575426.6666666666, ans=0.0 2023-09-30 03:02:15,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:15,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:02:16,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:02:17,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 03:02:22,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:02:22,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:02:24,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 03:02:26,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:02:26,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:02:29,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:02:31,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 03:02:31,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:32,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 03:02:34,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:37,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:37,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:02:42,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 03:02:42,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 03:02:44,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 03:02:46,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=575560.0, ans=0.125 2023-09-30 03:02:48,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=575560.0, ans=0.125 2023-09-30 03:02:49,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:02:51,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=575560.0, ans=0.0 2023-09-30 03:02:52,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 03:02:53,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:58,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=575560.0, ans=0.0 2023-09-30 03:03:01,428 INFO [train.py:1039] (1/4) Epoch 17, batch 1350, loss[loss=0.1823, simple_loss=0.2481, pruned_loss=0.05827, over 23813.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2575, pruned_loss=0.05486, over 4709451.33 frames. ], batch size: 212, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:03:03,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 03:03:07,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:08,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=575626.6666666666, ans=0.125 2023-09-30 03:03:09,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:11,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:11,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:14,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:03:14,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 03:03:22,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:03:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:03:26,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 03:03:27,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:03:27,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:03:27,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 03:03:29,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 03:03:31,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 03:03:34,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:34,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 03:03:40,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=575760.0, ans=0.125 2023-09-30 03:03:46,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:56,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:57,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:03:58,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 03:04:01,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:01,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=575826.6666666666, ans=0.0 2023-09-30 03:04:02,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 03:04:02,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:04:04,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:04:06,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:04:10,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 03:04:11,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:04:17,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 03:04:19,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 03:04:24,863 INFO [train.py:1039] (1/4) Epoch 17, batch 1400, loss[loss=0.173, simple_loss=0.2576, pruned_loss=0.04422, over 24495.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.256, pruned_loss=0.05458, over 4696732.57 frames. ], batch size: 66, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:04:25,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 03:04:26,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:29,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:04:31,084 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=12.0 2023-09-30 03:04:31,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:04:33,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 03:04:35,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 03:04:36,886 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.880e+02 2.143e+02 2.482e+02 5.482e+02, threshold=4.285e+02, percent-clipped=2.0 2023-09-30 03:04:48,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:04:50,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:04:52,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:04:52,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:04:55,801 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:04:57,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 03:05:06,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:08,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:11,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 03:05:11,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:05:13,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:05:13,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:05:15,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:15,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:05:17,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:05:17,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:05:17,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 03:05:17,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:05:22,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:27,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:05:27,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=576160.0, ans=0.125 2023-09-30 03:05:36,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 03:05:37,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:05:38,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:05:41,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 03:05:43,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:44,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:05:46,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:05:48,361 INFO [train.py:1039] (1/4) Epoch 17, batch 1450, loss[loss=0.1757, simple_loss=0.2608, pruned_loss=0.0453, over 24637.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2548, pruned_loss=0.05366, over 4706821.84 frames. ], batch size: 68, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:05:48,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:05:48,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:50,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:05:55,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:56,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:05:58,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:58,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 03:05:59,262 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-09-30 03:05:59,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:06:02,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 03:06:02,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:05,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:05,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 03:06:05,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:06,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:06:06,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 03:06:08,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:08,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:06:09,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=576360.0, ans=0.125 2023-09-30 03:06:11,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:12,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:17,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:06:17,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:06:19,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:06:21,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:21,907 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.04 vs. limit=10.0 2023-09-30 03:06:22,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:22,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:06:24,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:24,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:28,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 03:06:31,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:31,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=576426.6666666666, ans=0.125 2023-09-30 03:06:35,274 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 03:06:35,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:36,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:06:38,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:39,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.88 vs. limit=15.0 2023-09-30 03:06:41,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 03:06:44,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:45,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.93 vs. limit=22.5 2023-09-30 03:06:46,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 03:06:46,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 03:06:47,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:51,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:06:51,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:53,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 03:06:56,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 03:06:56,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 03:06:58,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:58,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:07:11,619 INFO [train.py:1039] (1/4) Epoch 17, batch 1500, loss[loss=0.1886, simple_loss=0.2612, pruned_loss=0.05795, over 23919.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2555, pruned_loss=0.05378, over 4715918.56 frames. ], batch size: 212, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:07:11,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 03:07:11,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:07:11,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:07:13,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:13,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:14,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:07:16,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 03:07:16,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:07:16,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:07:18,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:19,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:07:21,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:07:21,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:22,949 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 1.876e+02 2.186e+02 2.555e+02 3.680e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 03:07:26,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:26,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 03:07:27,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:07:27,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:07:29,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:35,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 03:07:39,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 03:07:42,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:42,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 03:07:45,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:07:48,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:07:49,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:49,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:07:51,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 03:07:52,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:07:54,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:07:54,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 03:07:54,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:08:01,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:08:01,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 03:08:07,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:08:09,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:08:11,909 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=12.0 2023-09-30 03:08:14,738 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 03:08:14,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:14,832 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 03:08:17,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:19,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:08:21,470 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 03:08:21,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:08:24,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 03:08:25,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=576893.3333333334, ans=0.1 2023-09-30 03:08:26,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:29,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:29,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=576893.3333333334, ans=0.125 2023-09-30 03:08:30,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:30,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:30,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:32,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:08:34,389 INFO [train.py:1039] (1/4) Epoch 17, batch 1550, loss[loss=0.1944, simple_loss=0.2766, pruned_loss=0.05612, over 23970.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.257, pruned_loss=0.05419, over 4724517.49 frames. ], batch size: 80, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:08:34,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 03:08:35,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 03:08:36,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:08:36,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 03:08:37,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 03:08:38,501 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.63 vs. limit=10.0 2023-09-30 03:08:39,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:08:39,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=576960.0, ans=0.125 2023-09-30 03:08:40,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:42,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:08:42,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:08:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:45,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:49,568 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 03:08:49,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:49,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:08:51,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:08:53,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:08:53,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 03:08:53,532 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.49 vs. limit=10.0 2023-09-30 03:08:56,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:08:56,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 03:08:58,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 03:08:58,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 03:08:58,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:59,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:02,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:09:05,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 03:09:05,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 03:09:11,318 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.55 vs. limit=15.0 2023-09-30 03:09:14,218 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.93 vs. limit=15.0 2023-09-30 03:09:16,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:18,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:09:19,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:09:19,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:09:19,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 03:09:20,097 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=577093.3333333334, ans=0.0 2023-09-30 03:09:25,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:09:26,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:31,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:09:33,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=577160.0, ans=0.0 2023-09-30 03:09:34,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:09:34,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:34,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 03:09:34,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:36,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:09:38,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:38,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:09:38,147 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 03:09:39,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=577226.6666666666, ans=0.025 2023-09-30 03:09:39,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=577226.6666666666, ans=0.0 2023-09-30 03:09:41,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:09:47,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 03:09:49,057 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.89 vs. limit=6.0 2023-09-30 03:09:52,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:54,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:54,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 03:09:55,667 INFO [train.py:1039] (1/4) Epoch 17, batch 1600, loss[loss=0.1633, simple_loss=0.2385, pruned_loss=0.044, over 19859.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2576, pruned_loss=0.05438, over 4708773.89 frames. ], batch size: 43, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:09:55,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:55,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:55,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:09:56,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:09:57,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:10:00,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=577293.3333333334, ans=0.125 2023-09-30 03:10:01,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:01,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=577293.3333333334, ans=0.0 2023-09-30 03:10:02,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 03:10:04,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 03:10:06,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 03:10:06,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:07,922 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.101e+02 2.421e+02 4.828e+02, threshold=4.202e+02, percent-clipped=4.0 2023-09-30 03:10:08,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 03:10:09,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:10:11,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:10:16,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:10:19,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 03:10:22,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:10:22,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 03:10:24,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:24,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 03:10:31,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 03:10:40,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:40,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 03:10:42,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:42,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:42,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:10:42,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=577426.6666666666, ans=0.125 2023-09-30 03:10:45,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:10:50,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:10:53,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:10:53,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:10:58,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:10:58,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:11:01,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:11:07,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:09,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:11:11,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 03:11:11,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:11:12,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 03:11:18,130 INFO [train.py:1039] (1/4) Epoch 17, batch 1650, loss[loss=0.1598, simple_loss=0.2349, pruned_loss=0.0423, over 24431.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2589, pruned_loss=0.05519, over 4701088.60 frames. ], batch size: 58, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:11:18,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:21,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:11:21,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:11:21,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 03:11:23,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 03:11:23,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 03:11:23,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 03:11:27,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:28,331 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=577626.6666666666, ans=0.125 2023-09-30 03:11:29,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:29,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:11:29,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:11:31,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:32,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 03:11:36,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:11:36,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:36,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:11:36,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:11:37,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 03:11:37,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 03:11:45,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:11:48,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:11:58,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 03:11:59,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:00,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 03:12:04,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:07,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:12:07,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:12:07,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:08,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:12:08,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:11,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:11,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:13,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:13,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:14,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:14,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:12:19,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:19,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 03:12:20,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:22,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 03:12:22,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 03:12:22,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 03:12:22,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:24,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:12:24,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:24,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:24,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 03:12:29,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:30,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:12:31,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:32,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 03:12:37,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=577893.3333333334, ans=0.07 2023-09-30 03:12:38,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:38,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:12:39,907 INFO [train.py:1039] (1/4) Epoch 17, batch 1700, loss[loss=0.1701, simple_loss=0.2505, pruned_loss=0.04485, over 24451.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.257, pruned_loss=0.05483, over 4693548.31 frames. ], batch size: 69, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:12:39,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 03:12:40,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:12:40,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:12:40,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:43,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:12:44,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:12:44,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 03:12:48,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:12:48,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=577960.0, ans=0.0 2023-09-30 03:12:51,379 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.846e+02 2.021e+02 2.222e+02 3.253e+02, threshold=4.041e+02, percent-clipped=0.0 2023-09-30 03:12:56,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:58,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=578026.6666666666, ans=0.125 2023-09-30 03:13:00,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:13:07,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:13:07,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:08,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:13:08,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:10,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 03:13:11,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:13:13,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:13,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:13:13,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=578093.3333333334, ans=0.125 2023-09-30 03:13:14,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:13:17,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 03:13:17,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 03:13:19,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.25 vs. limit=15.0 2023-09-30 03:13:20,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:23,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 03:13:24,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:13:31,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:34,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=578160.0, ans=0.07 2023-09-30 03:13:35,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:35,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:37,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:13:37,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 03:13:38,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:40,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:40,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 03:13:41,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:13:41,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:41,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:41,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:13:43,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:43,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:13:45,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:45,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:13:46,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:53,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:53,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 03:13:54,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:56,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:58,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 03:13:58,822 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.32 vs. limit=15.0 2023-09-30 03:13:59,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=578226.6666666666, ans=0.07 2023-09-30 03:14:02,728 INFO [train.py:1039] (1/4) Epoch 17, batch 1750, loss[loss=0.1707, simple_loss=0.2325, pruned_loss=0.05443, over 23485.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2565, pruned_loss=0.0544, over 4703360.71 frames. ], batch size: 285, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:14:04,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:06,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:06,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:14:07,081 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=578293.3333333334, ans=0.0 2023-09-30 03:14:08,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 03:14:09,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:14:13,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:14:14,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:15,068 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.46 vs. limit=15.0 2023-09-30 03:14:16,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.82 vs. limit=15.0 2023-09-30 03:14:17,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 03:14:19,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=578360.0, ans=0.2 2023-09-30 03:14:20,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:24,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 03:14:24,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:27,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:14:28,190 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.06 vs. limit=6.0 2023-09-30 03:14:30,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:14:30,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 03:14:31,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:14:31,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 03:14:37,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-09-30 03:14:41,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:14:43,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:14:43,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:48,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:48,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:50,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:14:51,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:54,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:14:54,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:56,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 03:14:59,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:15:02,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 03:15:04,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:04,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:05,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:15:09,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:15:09,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 03:15:11,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:12,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:16,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=578560.0, ans=0.125 2023-09-30 03:15:17,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:20,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:21,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:15:23,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 03:15:23,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:24,644 INFO [train.py:1039] (1/4) Epoch 17, batch 1800, loss[loss=0.1665, simple_loss=0.2362, pruned_loss=0.04838, over 23919.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2565, pruned_loss=0.05388, over 4713888.01 frames. ], batch size: 195, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:15:26,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:15:26,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:26,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:15:26,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:15:27,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:15:28,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=578626.6666666666, ans=0.1 2023-09-30 03:15:30,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:15:30,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:32,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:15:34,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:36,406 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.847e+02 2.029e+02 2.247e+02 3.215e+02, threshold=4.058e+02, percent-clipped=0.0 2023-09-30 03:15:36,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:15:37,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=578626.6666666666, ans=0.1 2023-09-30 03:15:39,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:15:41,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:15:42,136 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.24 vs. limit=10.0 2023-09-30 03:15:44,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:44,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:45,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:15:48,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:48,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 03:15:50,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:54,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:57,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 03:16:01,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 03:16:01,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 03:16:01,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:02,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:16:02,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:04,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:16:11,392 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 03:16:11,778 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=578760.0, ans=0.2 2023-09-30 03:16:12,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:16:14,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:16,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 03:16:17,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 03:16:17,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:16:18,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:16:20,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:16:26,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 03:16:31,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:16:31,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 03:16:32,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:16:32,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:32,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:16:32,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 03:16:37,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:16:37,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:16:39,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 03:16:39,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:42,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:42,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:16:42,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:44,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:45,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:16:47,375 INFO [train.py:1039] (1/4) Epoch 17, batch 1850, loss[loss=0.1835, simple_loss=0.251, pruned_loss=0.05802, over 23304.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2564, pruned_loss=0.05355, over 4714910.00 frames. ], batch size: 119, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:16:47,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:47,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:47,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=578960.0, ans=0.125 2023-09-30 03:16:50,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:16:52,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:16:52,519 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:17:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:17:01,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 03:17:05,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=579026.6666666666, ans=0.0 2023-09-30 03:17:06,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 03:17:08,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 03:17:12,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:12,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 03:17:12,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 03:17:17,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.20 vs. limit=15.0 2023-09-30 03:17:23,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:17:24,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 03:17:27,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:17:29,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:17:32,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 03:17:34,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:34,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:17:36,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:17:38,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:17:41,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:17:45,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:17:45,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:45,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:17:45,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:48,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:48,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:17:49,117 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:17:54,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 03:17:54,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:57,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:17:57,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:17:57,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 03:17:57,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 03:18:00,368 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 03:18:00,500 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 03:18:02,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:18:02,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:18:03,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:03,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:03,531 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 03:18:03,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:18:05,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:05,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=579226.6666666666, ans=0.2 2023-09-30 03:18:07,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:18:09,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:18:10,970 INFO [train.py:1039] (1/4) Epoch 17, batch 1900, loss[loss=0.201, simple_loss=0.2792, pruned_loss=0.06146, over 24090.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2575, pruned_loss=0.05406, over 4709714.50 frames. ], batch size: 80, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:18:11,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:18:11,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 03:18:14,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:14,188 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 03:18:14,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:18:15,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:20,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:23,281 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.816e+02 1.988e+02 2.233e+02 2.900e+02, threshold=3.976e+02, percent-clipped=0.0 2023-09-30 03:18:23,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:18:24,959 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 03:18:26,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 03:18:28,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:30,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:18:30,147 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 03:18:30,232 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 03:18:33,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 03:18:35,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:18:40,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 03:18:42,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 03:18:50,802 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=579426.6666666666, ans=0.125 2023-09-30 03:18:52,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 03:18:55,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 03:18:55,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:55,167 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 03:18:56,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 03:18:56,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 03:18:56,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 03:18:56,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:01,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 03:19:04,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:19:08,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:08,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 03:19:11,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:19:14,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 03:19:14,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:23,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:19:23,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:19:23,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:19:24,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:19:26,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:19:26,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:19:27,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:19:29,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:29,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:19:32,261 INFO [train.py:1039] (1/4) Epoch 17, batch 1950, loss[loss=0.1949, simple_loss=0.2693, pruned_loss=0.06027, over 23382.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.258, pruned_loss=0.05441, over 4705126.16 frames. ], batch size: 119, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:19:32,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:19:32,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:32,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:34,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:36,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=579626.6666666666, ans=0.5 2023-09-30 03:19:37,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:40,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:19:41,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:41,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:19:42,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 03:19:42,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:19:44,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:44,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=579626.6666666666, ans=0.125 2023-09-30 03:19:45,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:19:48,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:19:50,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:52,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:19:52,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=579693.3333333334, ans=0.125 2023-09-30 03:19:55,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:55,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:19:55,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:19:55,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:00,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:03,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:20:03,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:03,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:20:03,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 03:20:04,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:20:05,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:20:06,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:11,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:14,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:20:19,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:20:20,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:20:21,543 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=12.0 2023-09-30 03:20:22,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:20:22,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 03:20:22,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:20:27,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:20:28,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:20:30,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:38,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:40,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:43,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:44,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:48,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:20:48,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:48,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 03:20:48,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:20:50,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:51,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 03:20:52,966 INFO [train.py:1039] (1/4) Epoch 17, batch 2000, loss[loss=0.1989, simple_loss=0.277, pruned_loss=0.06033, over 24055.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2584, pruned_loss=0.05407, over 4715340.72 frames. ], batch size: 80, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:20:54,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:20:56,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=579960.0, ans=0.0 2023-09-30 03:20:57,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:59,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:21:00,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:00,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=579960.0, ans=0.0 2023-09-30 03:21:01,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:21:04,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:07,546 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.864e+02 2.151e+02 2.440e+02 3.319e+02, threshold=4.303e+02, percent-clipped=0.0 2023-09-30 03:21:07,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 03:21:07,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:21:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:21:13,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 03:21:14,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:21:14,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:21:18,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:21:19,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 03:21:21,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=580093.3333333334, ans=0.125 2023-09-30 03:21:26,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 03:21:26,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:21:28,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 03:21:28,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:33,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:21:33,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:21:33,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:33,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:36,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:37,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 03:21:40,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 03:21:40,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:40,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:21:45,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:47,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:21:47,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:48,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:48,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:50,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:50,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:50,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:51,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:53,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=580160.0, ans=0.125 2023-09-30 03:21:55,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:57,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 03:22:03,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:22:04,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:22:12,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:15,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:17,173 INFO [train.py:1039] (1/4) Epoch 17, batch 2050, loss[loss=0.1704, simple_loss=0.2343, pruned_loss=0.05328, over 23776.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.258, pruned_loss=0.05432, over 4711704.01 frames. ], batch size: 212, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:22:17,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:22:17,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:22:18,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:20,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:22,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:23,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:28,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:22:31,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:22:31,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:33,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:22:33,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 03:22:33,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=580360.0, ans=0.125 2023-09-30 03:22:35,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:22:35,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:22:36,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:22:45,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:45,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:49,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 03:22:52,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:53,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 03:22:53,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:57,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:22:58,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:00,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:23:00,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:23:00,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=580426.6666666666, ans=0.2 2023-09-30 03:23:01,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:23:03,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:23:05,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:23:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:08,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:23:12,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:23:13,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:23:18,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:22,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:23:23,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 03:23:30,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:30,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:23:33,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:23:33,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=580560.0, ans=0.2 2023-09-30 03:23:35,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 03:23:40,049 INFO [train.py:1039] (1/4) Epoch 17, batch 2100, loss[loss=0.1778, simple_loss=0.2455, pruned_loss=0.05502, over 23746.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2572, pruned_loss=0.0538, over 4723823.14 frames. ], batch size: 164, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:23:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 03:23:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:40,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:41,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:23:41,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:41,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 03:23:42,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=580626.6666666666, ans=0.1 2023-09-30 03:23:43,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 03:23:44,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:47,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=580626.6666666666, ans=0.125 2023-09-30 03:23:48,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:23:48,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:23:51,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:51,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:23:51,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 03:23:53,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.830e+02 2.027e+02 2.301e+02 3.593e+02, threshold=4.054e+02, percent-clipped=0.0 2023-09-30 03:23:53,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:23:53,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 03:23:53,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 03:23:55,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:23:55,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:23:55,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 03:23:57,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 03:23:57,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=580693.3333333334, ans=0.1 2023-09-30 03:24:04,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 03:24:04,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:24:07,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:07,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:24:11,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:24:11,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 03:24:13,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:13,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 03:24:13,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 03:24:15,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:15,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 03:24:15,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 03:24:15,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 03:24:16,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:24:19,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:24:21,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:23,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:25,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:26,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:26,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 03:24:28,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:28,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:29,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:29,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 03:24:32,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 03:24:33,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 03:24:36,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:24:39,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:24:39,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 03:24:45,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=580893.3333333334, ans=0.1 2023-09-30 03:24:47,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:48,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:24:48,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=580893.3333333334, ans=0.2 2023-09-30 03:24:50,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:24:50,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:24:50,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:24:51,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:24:51,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:52,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=580893.3333333334, ans=0.125 2023-09-30 03:24:52,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=580893.3333333334, ans=0.04949747468305833 2023-09-30 03:24:53,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:24:53,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:24:53,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:54,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 03:24:56,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 03:24:56,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:58,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:58,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:25:00,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:25:00,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:25:03,754 INFO [train.py:1039] (1/4) Epoch 17, batch 2150, loss[loss=0.1765, simple_loss=0.2458, pruned_loss=0.05355, over 23812.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.256, pruned_loss=0.05332, over 4716840.99 frames. ], batch size: 212, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:25:05,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:25:07,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:08,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:08,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:25:08,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:10,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:25:13,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:15,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:25:15,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:25:16,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.94 vs. limit=10.0 2023-09-30 03:25:18,800 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=581026.6666666666, ans=0.125 2023-09-30 03:25:20,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:20,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 03:25:23,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:24,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:25:26,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:26,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:26,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:28,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:25:28,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:28,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:25:30,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:25:31,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 03:25:33,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:25:34,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:34,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:35,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:25:35,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:25:38,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:38,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=581093.3333333334, ans=0.125 2023-09-30 03:25:40,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:25:40,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.08 vs. limit=10.0 2023-09-30 03:25:40,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.78 vs. limit=6.0 2023-09-30 03:25:41,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:41,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 03:25:43,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:25:44,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=581093.3333333334, ans=0.025 2023-09-30 03:25:46,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:46,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:48,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:49,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:25:50,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:52,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:52,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 03:25:53,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 03:25:53,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:25:55,166 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 03:25:55,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:55,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:25:56,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 03:25:56,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:25:56,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 03:25:56,858 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 03:25:56,859 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 03:25:56,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 03:25:59,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:01,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:26:01,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:26:02,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:03,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:26:06,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:06,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:10,035 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=581226.6666666666, ans=0.0 2023-09-30 03:26:11,538 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=581226.6666666666, ans=0.125 2023-09-30 03:26:16,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:26:16,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 03:26:19,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:26:24,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:25,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:26:26,983 INFO [train.py:1039] (1/4) Epoch 17, batch 2200, loss[loss=0.16, simple_loss=0.2437, pruned_loss=0.03819, over 24457.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2566, pruned_loss=0.05368, over 4707147.62 frames. ], batch size: 66, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:26:27,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:26:27,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:26:30,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:30,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:26:30,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 03:26:36,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 03:26:40,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:26:41,769 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.818e+02 1.974e+02 2.282e+02 3.535e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 03:26:42,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=581360.0, ans=0.125 2023-09-30 03:26:47,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 03:26:50,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:51,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:26:51,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:26:52,286 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=581360.0, ans=0.125 2023-09-30 03:26:55,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:26:56,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 03:27:01,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:27:01,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:03,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 03:27:05,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:27:07,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:08,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:27:10,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:13,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 03:27:15,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:16,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 03:27:19,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:19,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:27:19,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:21,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:27:23,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:23,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:23,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:26,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:27:26,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:27:28,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:27:32,061 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-09-30 03:27:32,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:27:32,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:27:34,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=581560.0, ans=0.125 2023-09-30 03:27:36,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:27:37,956 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 03:27:40,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:27:40,340 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 03:27:41,220 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.46 vs. limit=15.0 2023-09-30 03:27:41,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:27:41,931 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 03:27:44,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:44,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:27:46,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 03:27:49,188 INFO [train.py:1039] (1/4) Epoch 17, batch 2250, loss[loss=0.2187, simple_loss=0.2901, pruned_loss=0.07367, over 23167.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2569, pruned_loss=0.05358, over 4713852.09 frames. ], batch size: 119, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:27:50,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:27:54,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:27:58,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:28:01,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:28:03,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:04,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:05,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:28:07,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 03:28:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:07,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:28:11,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 03:28:11,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:28:12,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:15,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:22,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:23,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:28:23,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:28:25,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 03:28:25,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:29,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:28:32,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:34,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:35,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:28:35,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:37,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:40,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:28:45,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:28:47,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=581826.6666666666, ans=0.125 2023-09-30 03:28:50,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:28:55,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:28:55,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:28:58,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:29:03,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:29:05,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:29:05,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 03:29:05,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:06,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:29:09,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 03:29:09,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=581893.3333333334, ans=0.1 2023-09-30 03:29:10,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:29:11,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:12,826 INFO [train.py:1039] (1/4) Epoch 17, batch 2300, loss[loss=0.1947, simple_loss=0.2585, pruned_loss=0.06548, over 23825.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2574, pruned_loss=0.05362, over 4725703.83 frames. ], batch size: 179, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:29:19,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:19,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:29:21,345 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 03:29:22,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:26,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=581960.0, ans=0.0 2023-09-30 03:29:27,891 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.892e+02 2.152e+02 2.503e+02 3.822e+02, threshold=4.305e+02, percent-clipped=0.0 2023-09-30 03:29:28,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:29:29,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:29:29,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:29:29,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:29,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 03:29:31,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:29:32,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:29:34,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:29:38,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:29:42,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:29:46,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:29:51,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:29:53,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:57,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:30:00,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:01,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=582160.0, ans=0.125 2023-09-30 03:30:04,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:30:04,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:30:06,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:30:06,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 03:30:11,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:30:11,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:12,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:12,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:30:12,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:14,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:30:14,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:30:14,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 03:30:14,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:30:14,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:15,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 03:30:23,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:30:24,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=582226.6666666666, ans=0.2 2023-09-30 03:30:25,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:30:29,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:29,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:30:30,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:30:33,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:30:33,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:30:33,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:30:35,039 INFO [train.py:1039] (1/4) Epoch 17, batch 2350, loss[loss=0.2024, simple_loss=0.2656, pruned_loss=0.06958, over 23815.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2586, pruned_loss=0.0541, over 4715238.02 frames. ], batch size: 179, lr: 6.08e-03, grad_scale: 16.0 2023-09-30 03:30:35,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 03:30:40,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:30:41,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 03:30:47,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 03:30:50,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:54,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:30:54,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:56,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 03:30:59,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:31:00,566 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.39 vs. limit=15.0 2023-09-30 03:31:08,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 03:31:09,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:31:12,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:31:12,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:31:14,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:31:14,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=582426.6666666666, ans=0.0 2023-09-30 03:31:16,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 03:31:18,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:31:18,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=582426.6666666666, ans=0.0 2023-09-30 03:31:21,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:31:21,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:21,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:31:24,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:31:26,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 03:31:26,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:31:29,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:31:29,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:31:31,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 03:31:31,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:31:36,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 03:31:36,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:31:42,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 03:31:47,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 03:31:47,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:47,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 03:31:49,204 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 03:31:49,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 03:31:51,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 03:31:54,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:31:56,576 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=582626.6666666666, ans=0.125 2023-09-30 03:31:57,586 INFO [train.py:1039] (1/4) Epoch 17, batch 2400, loss[loss=0.1819, simple_loss=0.242, pruned_loss=0.06084, over 23576.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2584, pruned_loss=0.05397, over 4719687.73 frames. ], batch size: 256, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:31:57,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:32:00,909 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.41 vs. limit=15.0 2023-09-30 03:32:02,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:32:04,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:32:06,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 03:32:06,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 03:32:13,859 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.898e+02 2.064e+02 2.332e+02 3.496e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 03:32:14,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:32:14,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:32:17,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 03:32:17,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=582693.3333333334, ans=0.0 2023-09-30 03:32:18,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:32:18,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:18,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 03:32:27,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:28,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 03:32:35,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:32:38,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 03:32:42,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:32:43,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:47,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=582826.6666666666, ans=0.125 2023-09-30 03:32:48,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:32:48,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 03:32:50,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:32:52,308 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=582826.6666666666, ans=0.125 2023-09-30 03:32:56,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:32:59,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:03,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:03,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:33:03,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:33:03,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:33:04,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:04,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:04,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:33:11,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:11,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:33:11,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 03:33:13,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 03:33:16,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:33:16,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:16,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 03:33:18,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 03:33:18,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 03:33:18,488 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 03:33:19,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 03:33:20,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:33:21,465 INFO [train.py:1039] (1/4) Epoch 17, batch 2450, loss[loss=0.1943, simple_loss=0.2758, pruned_loss=0.0564, over 24370.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2575, pruned_loss=0.05364, over 4721585.77 frames. ], batch size: 77, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:33:21,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:21,805 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=582960.0, ans=0.05 2023-09-30 03:33:23,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:23,774 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 03:33:25,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:25,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:33:28,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:33:28,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:33,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:33,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:35,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 03:33:41,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:41,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:42,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:33:44,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:33:44,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:33:44,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 03:33:50,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:51,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:33:52,333 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=22.5 2023-09-30 03:33:53,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:56,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:33:58,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:33:58,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:00,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:34:01,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 03:34:03,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:34:11,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:12,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:34:13,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:13,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:34:13,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:14,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:34:14,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 03:34:18,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:18,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:34:23,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:34:23,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:28,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:34:28,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 03:34:28,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:34:29,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:34:29,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 03:34:31,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:34:33,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:34:36,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:34:38,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:39,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:34:44,640 INFO [train.py:1039] (1/4) Epoch 17, batch 2500, loss[loss=0.1945, simple_loss=0.2658, pruned_loss=0.06167, over 23867.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2566, pruned_loss=0.05378, over 4706783.47 frames. ], batch size: 195, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:34:44,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 03:34:44,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:34:50,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:34:57,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=583293.3333333334, ans=0.125 2023-09-30 03:35:00,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:35:00,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:35:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:35:01,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 03:35:03,352 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.839e+02 2.064e+02 2.322e+02 3.484e+02, threshold=4.127e+02, percent-clipped=0.0 2023-09-30 03:35:08,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=583360.0, ans=0.125 2023-09-30 03:35:09,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:35:11,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:11,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:35:11,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:35:11,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 03:35:13,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:13,547 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:35:14,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:14,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 03:35:14,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:16,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 03:35:16,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:22,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:35:23,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:25,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:35:27,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 03:35:27,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:35:30,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:34,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:39,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:43,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:35:43,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=583493.3333333334, ans=0.5 2023-09-30 03:35:47,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:35:51,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=583560.0, ans=0.125 2023-09-30 03:35:51,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.21 vs. limit=15.0 2023-09-30 03:35:52,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 03:35:52,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:52,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:35:52,583 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=583560.0, ans=0.125 2023-09-30 03:35:54,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:35:54,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:35:54,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=583560.0, ans=0.125 2023-09-30 03:35:55,941 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 03:35:55,942 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 03:35:55,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 03:35:58,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:36:01,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 03:36:01,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 03:36:02,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:36:02,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=583560.0, ans=0.02 2023-09-30 03:36:03,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 03:36:05,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 03:36:08,048 INFO [train.py:1039] (1/4) Epoch 17, batch 2550, loss[loss=0.1907, simple_loss=0.2568, pruned_loss=0.06233, over 23769.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2572, pruned_loss=0.05416, over 4712482.39 frames. ], batch size: 164, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:36:08,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:11,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:36:12,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:36:14,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:15,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 03:36:15,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:36:21,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 03:36:22,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:36:22,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=583693.3333333334, ans=0.125 2023-09-30 03:36:24,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:27,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:36:27,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 03:36:29,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:36:29,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:29,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:36:29,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=583693.3333333334, ans=0.125 2023-09-30 03:36:33,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:36:33,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 03:36:33,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:36:33,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:33,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 03:36:47,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.07 vs. limit=10.0 2023-09-30 03:36:47,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:36:52,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:36:53,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:53,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:54,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:37:00,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:37:03,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:37:03,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:37:03,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:37:03,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:37:05,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:37:10,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:10,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:12,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=583826.6666666666, ans=0.125 2023-09-30 03:37:15,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:37:15,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 03:37:15,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:37:16,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:18,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:37:18,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:37:19,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:24,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:37:27,457 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=12.0 2023-09-30 03:37:28,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:31,105 INFO [train.py:1039] (1/4) Epoch 17, batch 2600, loss[loss=0.2007, simple_loss=0.27, pruned_loss=0.06569, over 23868.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2578, pruned_loss=0.05421, over 4718476.59 frames. ], batch size: 164, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:37:31,291 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 03:37:36,302 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 03:37:36,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:37:36,392 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 03:37:36,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 03:37:36,555 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 03:37:41,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:41,375 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 03:37:42,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=583960.0, ans=0.125 2023-09-30 03:37:43,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 03:37:44,870 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 03:37:46,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:37:48,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 03:37:49,477 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.083e+02 2.505e+02 2.924e+02 4.278e+02, threshold=5.011e+02, percent-clipped=1.0 2023-09-30 03:37:51,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 03:37:52,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:37:52,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 03:37:55,944 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 03:37:55,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 03:37:56,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=584026.6666666666, ans=0.04949747468305833 2023-09-30 03:37:56,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=584026.6666666666, ans=0.125 2023-09-30 03:38:02,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:02,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:04,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:04,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 03:38:07,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:38:12,685 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 03:38:18,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:18,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:18,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 03:38:18,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:18,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:20,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 03:38:22,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:38:23,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:38:25,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:29,564 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 03:38:29,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:29,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:38:34,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:36,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:38:36,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 03:38:38,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:39,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:38:39,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:38:40,414 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.87 vs. limit=15.0 2023-09-30 03:38:46,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 03:38:47,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:50,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:38:53,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 03:38:53,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:54,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=584293.3333333334, ans=0.2 2023-09-30 03:38:55,194 INFO [train.py:1039] (1/4) Epoch 17, batch 2650, loss[loss=0.2147, simple_loss=0.2743, pruned_loss=0.07759, over 23681.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2594, pruned_loss=0.05576, over 4704644.80 frames. ], batch size: 232, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:38:55,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:38:55,378 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 03:38:55,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:38:58,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:59,993 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=584293.3333333334, ans=0.0 2023-09-30 03:39:01,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:39:02,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:39:05,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:39:07,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 03:39:07,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:39:08,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:39:11,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 03:39:12,956 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 03:39:15,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:20,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 03:39:20,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:21,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 03:39:27,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:39:27,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:30,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 03:39:30,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 03:39:30,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=584426.6666666666, ans=0.125 2023-09-30 03:39:33,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:39:35,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.74 vs. limit=10.0 2023-09-30 03:39:38,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 03:39:38,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:38,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=584426.6666666666, ans=0.0 2023-09-30 03:39:38,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=584426.6666666666, ans=0.125 2023-09-30 03:39:41,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:41,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:39:41,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:42,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:42,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:43,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.whiten.whitening_limit, batch_count=584493.3333333334, ans=15.0 2023-09-30 03:39:44,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=584493.3333333334, ans=0.125 2023-09-30 03:39:45,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:47,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:39:49,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:39:51,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:39:52,234 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=15.0 2023-09-30 03:39:52,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:53,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:39:53,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:55,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:56,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:39:58,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:58,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:39:58,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:58,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=584493.3333333334, ans=0.2 2023-09-30 03:40:00,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 03:40:01,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=584560.0, ans=0.125 2023-09-30 03:40:02,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:05,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:08,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:40:09,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:12,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:12,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 03:40:16,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:40:17,483 INFO [train.py:1039] (1/4) Epoch 17, batch 2700, loss[loss=0.1604, simple_loss=0.2376, pruned_loss=0.04156, over 22749.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2604, pruned_loss=0.05599, over 4709696.71 frames. ], batch size: 50, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:40:17,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 03:40:19,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:40:20,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:20,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:22,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:40:22,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:22,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:40:22,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:40:23,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 03:40:25,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:40:25,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=584626.6666666666, ans=0.0 2023-09-30 03:40:26,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:40:27,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:40:27,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:31,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:40:32,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 03:40:34,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:40:36,042 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.817e+02 2.004e+02 2.215e+02 2.992e+02, threshold=4.008e+02, percent-clipped=0.0 2023-09-30 03:40:40,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:40:40,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:40:45,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:40:45,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:47,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:40:47,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:40:49,454 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.36 vs. limit=15.0 2023-09-30 03:40:50,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:40:53,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:53,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:40:53,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:40:58,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:58,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:41:07,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:41:09,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:41:12,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:41:12,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:18,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:18,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:18,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:41:19,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:21,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:21,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=584893.3333333334, ans=0.125 2023-09-30 03:41:22,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:41:23,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=584893.3333333334, ans=22.5 2023-09-30 03:41:24,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:41:27,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:27,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:30,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 03:41:32,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:33,276 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.85 vs. limit=15.0 2023-09-30 03:41:34,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:41:34,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 03:41:34,863 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=584893.3333333334, ans=0.125 2023-09-30 03:41:36,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 03:41:36,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=584893.3333333334, ans=0.1 2023-09-30 03:41:37,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:39,350 INFO [train.py:1039] (1/4) Epoch 17, batch 2750, loss[loss=0.1814, simple_loss=0.2524, pruned_loss=0.05514, over 18571.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2592, pruned_loss=0.05582, over 4706354.32 frames. ], batch size: 39, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:41:40,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:41:41,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:45,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:45,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:41:45,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=584960.0, ans=0.125 2023-09-30 03:41:47,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:50,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:41:50,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:41:52,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:41:52,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:52,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 03:41:52,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:41:52,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:57,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 03:42:00,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:42:00,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:00,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:01,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:42:01,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:42:02,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=585026.6666666666, ans=0.125 2023-09-30 03:42:03,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:42:03,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:05,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:09,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=585026.6666666666, ans=0.0 2023-09-30 03:42:09,580 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.31 vs. limit=15.0 2023-09-30 03:42:10,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:42:12,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:42:12,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:42:13,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:15,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:42:21,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:25,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:42:25,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:28,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:28,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:42:29,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:42:30,522 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.06 vs. limit=12.0 2023-09-30 03:42:35,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:42:35,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:35,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 03:42:42,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:44,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 03:42:50,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:42:53,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:42:53,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 03:42:54,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:42:56,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:42:56,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 03:42:57,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:42:58,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:43:00,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:00,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:00,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 03:43:00,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:01,823 INFO [train.py:1039] (1/4) Epoch 17, batch 2800, loss[loss=0.1678, simple_loss=0.2116, pruned_loss=0.06201, over 19016.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2573, pruned_loss=0.05547, over 4689049.43 frames. ], batch size: 388, lr: 6.07e-03, grad_scale: 16.0 2023-09-30 03:43:01,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:03,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:03,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=585293.3333333334, ans=0.125 2023-09-30 03:43:04,918 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 03:43:04,919 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 03:43:08,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:09,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:43:11,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:43:15,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:43:18,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 03:43:20,141 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.888e+02 2.133e+02 2.598e+02 4.037e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 03:43:20,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:43:22,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 03:43:23,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:24,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:43:24,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:28,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:29,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:29,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:43:31,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:43:39,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:43:39,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=585426.6666666666, ans=0.125 2023-09-30 03:43:42,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:45,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:45,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:43:46,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:49,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=585493.3333333334, ans=0.125 2023-09-30 03:43:52,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:43:52,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 03:43:52,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:54,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:54,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:43:59,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:59,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:04,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:44:06,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:44:07,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:07,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:44:07,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:44:07,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=585560.0, ans=0.125 2023-09-30 03:44:09,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:44:10,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:44:10,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 03:44:10,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:12,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:44:12,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:14,278 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.86 vs. limit=15.0 2023-09-30 03:44:15,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 03:44:15,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:16,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:44:16,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:44:18,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 03:44:22,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=585626.6666666666, ans=0.1 2023-09-30 03:44:23,656 INFO [train.py:1039] (1/4) Epoch 17, batch 2850, loss[loss=0.1727, simple_loss=0.2434, pruned_loss=0.05103, over 23604.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.257, pruned_loss=0.05476, over 4701976.13 frames. ], batch size: 232, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:44:24,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:44:24,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:44:25,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:44:28,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:32,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:44:32,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:44:32,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:44:35,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:36,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:38,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:44:40,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 03:44:44,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=585693.3333333334, ans=0.125 2023-09-30 03:44:44,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=585693.3333333334, ans=0.07 2023-09-30 03:44:45,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 03:44:45,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:48,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 03:44:49,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:51,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 03:44:52,158 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.39 vs. limit=15.0 2023-09-30 03:44:52,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 03:44:54,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:06,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:08,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:10,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:45:11,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:45:11,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:45:11,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:45:14,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:45:14,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 03:45:16,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:45:18,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:18,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:18,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:21,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:21,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:23,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:23,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:26,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:45:26,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:26,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=585826.6666666666, ans=0.125 2023-09-30 03:45:26,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=585826.6666666666, ans=0.125 2023-09-30 03:45:27,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:31,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:45:35,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:45:36,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 03:45:38,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 03:45:39,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:45:40,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=585893.3333333334, ans=0.2 2023-09-30 03:45:41,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 03:45:41,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:45:41,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:43,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:45:43,479 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 03:45:43,555 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 03:45:43,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:45:43,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:46,529 INFO [train.py:1039] (1/4) Epoch 17, batch 2900, loss[loss=0.1567, simple_loss=0.2321, pruned_loss=0.04065, over 24564.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2575, pruned_loss=0.05463, over 4704112.19 frames. ], batch size: 60, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:45:50,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:45:50,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:50,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:53,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 03:45:56,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:56,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 03:45:56,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=585960.0, ans=0.09899494936611666 2023-09-30 03:45:58,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 03:45:59,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:45:59,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:46:01,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:03,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:46:06,433 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.832e+02 2.093e+02 2.427e+02 4.261e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 03:46:07,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:46:08,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:46:09,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.33 vs. limit=12.0 2023-09-30 03:46:11,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:46:11,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 03:46:13,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:46:13,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:13,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=586026.6666666666, ans=0.125 2023-09-30 03:46:16,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 03:46:18,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 03:46:21,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:46:21,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 03:46:21,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:46:24,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:46:24,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:46:26,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:27,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:28,523 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=12.0 2023-09-30 03:46:32,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:46:35,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:46:38,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 03:46:38,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 03:46:38,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:46:42,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:46:44,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 03:46:47,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:46:51,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:47:02,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:47:02,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:47:02,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 03:47:05,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:05,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 03:47:07,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:07,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:47:08,574 INFO [train.py:1039] (1/4) Epoch 17, batch 2950, loss[loss=0.1767, simple_loss=0.2502, pruned_loss=0.05163, over 23195.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2575, pruned_loss=0.05426, over 4712946.22 frames. ], batch size: 93, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:47:15,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:15,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 03:47:17,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:17,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:19,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:47:20,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:47:20,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 03:47:20,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=586293.3333333334, ans=0.0 2023-09-30 03:47:22,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 03:47:22,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:47:22,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:28,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.71 vs. limit=6.0 2023-09-30 03:47:30,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:33,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:47:35,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:47:36,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:38,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:47:38,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:47:40,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0 2023-09-30 03:47:41,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:47:44,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 03:47:50,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 03:47:50,108 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 03:47:50,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=586426.6666666666, ans=0.125 2023-09-30 03:47:52,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:47:53,589 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 03:47:55,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 03:47:55,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:55,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:55,212 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 03:47:55,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:47:58,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 03:47:59,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:48:00,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:48:00,642 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=586493.3333333334, ans=0.1 2023-09-30 03:48:02,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:03,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:48:05,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:05,778 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 03:48:05,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:05,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 03:48:09,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=586493.3333333334, ans=0.1 2023-09-30 03:48:10,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=586493.3333333334, ans=0.125 2023-09-30 03:48:13,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:13,838 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=586560.0, ans=0.125 2023-09-30 03:48:15,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:48:16,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 03:48:16,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:48:19,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 03:48:21,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:23,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:48:24,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:48:26,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:26,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:48:28,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:48:28,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:28,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:48:29,374 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.25 vs. limit=22.5 2023-09-30 03:48:30,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:48:31,995 INFO [train.py:1039] (1/4) Epoch 17, batch 3000, loss[loss=0.1877, simple_loss=0.2596, pruned_loss=0.05789, over 23258.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2581, pruned_loss=0.05442, over 4719326.17 frames. ], batch size: 105, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:48:31,995 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 03:48:47,122 INFO [train.py:1071] (1/4) Epoch 17, validation: loss=0.2916, simple_loss=0.2691, pruned_loss=0.1571, over 1125622.00 frames. 2023-09-30 03:48:47,124 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 03:48:47,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:48,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:48:50,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:50,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 03:48:51,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:53,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:48:54,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:48:54,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=586626.6666666666, ans=0.125 2023-09-30 03:49:01,572 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 03:49:01,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 03:49:01,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=586626.6666666666, ans=0.04949747468305833 2023-09-30 03:49:05,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:49:05,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:49:07,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 03:49:07,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:10,632 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.835e+02 1.995e+02 2.224e+02 3.286e+02, threshold=3.989e+02, percent-clipped=0.0 2023-09-30 03:49:15,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:49:16,514 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.19 vs. limit=22.5 2023-09-30 03:49:25,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:49:25,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=586760.0, ans=0.0 2023-09-30 03:49:25,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=586760.0, ans=0.125 2023-09-30 03:49:31,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 03:49:33,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:49:34,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:49:36,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:38,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:49:38,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:38,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 03:49:42,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 03:49:42,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:49:43,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:49:47,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:49:47,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:47,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:49:47,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:49:50,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:49:50,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:50,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:49:52,552 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=586826.6666666666, ans=0.1 2023-09-30 03:49:53,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:56,753 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-09-30 03:49:57,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 03:49:57,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:49:58,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:49:58,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:50:02,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:02,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:03,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:50:05,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 03:50:05,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:06,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 03:50:06,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:50:08,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 03:50:13,257 INFO [train.py:1039] (1/4) Epoch 17, batch 3050, loss[loss=0.1825, simple_loss=0.2645, pruned_loss=0.05027, over 24658.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.259, pruned_loss=0.0551, over 4708110.67 frames. ], batch size: 65, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:50:13,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:13,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:50:13,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 03:50:15,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 03:50:15,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:50:16,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:50:16,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:18,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:50:18,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:18,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:50:21,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 03:50:23,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:50:26,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:26,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:50:30,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.37 vs. limit=15.0 2023-09-30 03:50:31,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:33,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=587026.6666666666, ans=0.1 2023-09-30 03:50:34,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 03:50:39,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 03:50:39,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 03:50:40,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:50:43,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:50:49,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:49,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:51,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:52,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:50:54,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:54,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:54,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:54,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:55,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:58,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:00,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:00,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 03:51:00,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:51:02,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:51:03,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.03 vs. limit=10.0 2023-09-30 03:51:06,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:51:06,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:51:07,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:07,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:12,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:51:13,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:20,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:20,534 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=587226.6666666666, ans=0.125 2023-09-30 03:51:21,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:51:21,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:21,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:23,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:51:23,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:51:25,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 03:51:27,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:27,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:27,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 03:51:30,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:31,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.77 vs. limit=12.0 2023-09-30 03:51:33,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:35,824 INFO [train.py:1039] (1/4) Epoch 17, batch 3100, loss[loss=0.2079, simple_loss=0.2628, pruned_loss=0.0765, over 19794.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2594, pruned_loss=0.0554, over 4699851.66 frames. ], batch size: 388, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:51:36,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:51:37,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:51:40,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 03:51:43,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 03:51:43,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=587293.3333333334, ans=0.125 2023-09-30 03:51:45,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 03:51:45,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:51:47,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=587293.3333333334, ans=0.5 2023-09-30 03:51:48,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:48,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:53,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:51:55,415 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.922e+02 2.326e+02 2.796e+02 3.777e+02, threshold=4.651e+02, percent-clipped=0.0 2023-09-30 03:51:55,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:02,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 03:52:05,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:52:07,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:08,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:08,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:52:10,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:52:10,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=587426.6666666666, ans=0.5 2023-09-30 03:52:12,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:52:12,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 03:52:12,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:52:15,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:15,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 03:52:18,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:52:20,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:52:21,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 03:52:23,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 03:52:25,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:25,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:28,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:28,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:30,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:52:32,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:52:32,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:52:33,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:52:33,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:52:33,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:33,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 03:52:37,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:38,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 03:52:41,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:52:43,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 03:52:43,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:45,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:45,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 03:52:56,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 03:52:58,043 INFO [train.py:1039] (1/4) Epoch 17, batch 3150, loss[loss=0.1648, simple_loss=0.2441, pruned_loss=0.04278, over 24351.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2571, pruned_loss=0.05452, over 4695496.45 frames. ], batch size: 61, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:52:58,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:52:59,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:01,171 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.63 vs. limit=22.5 2023-09-30 03:53:03,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:53:03,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:53:04,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 03:53:04,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:06,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:53:07,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 03:53:10,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:11,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 03:53:14,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 03:53:14,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:53:15,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=587693.3333333334, ans=0.125 2023-09-30 03:53:16,413 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 03:53:16,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:53:18,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 03:53:18,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 03:53:18,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 03:53:18,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:18,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:20,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:22,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 03:53:25,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:28,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:53:31,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 03:53:31,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:53:36,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:53:37,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:38,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 03:53:42,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 03:53:42,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:53:42,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:53:42,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=587760.0, ans=0.125 2023-09-30 03:53:43,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:53:43,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:43,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:53:45,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:53:45,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:53:45,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 03:53:47,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:53:47,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:47,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=587826.6666666666, ans=0.1 2023-09-30 03:53:49,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:53:50,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:50,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 03:53:52,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:53:55,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 03:53:55,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:55,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 03:53:56,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 03:53:58,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:53:59,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:54:00,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 03:54:00,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=587826.6666666666, ans=15.0 2023-09-30 03:54:01,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:54:02,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:54:07,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:54:08,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:08,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:54:14,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:54:16,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:18,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:54:21,673 INFO [train.py:1039] (1/4) Epoch 17, batch 3200, loss[loss=0.1721, simple_loss=0.2574, pruned_loss=0.04343, over 24520.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2558, pruned_loss=0.05407, over 4700729.20 frames. ], batch size: 71, lr: 6.06e-03, grad_scale: 16.0 2023-09-30 03:54:23,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:54:23,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:54:26,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:28,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:54:28,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 03:54:31,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:54:37,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:54:39,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=588026.6666666666, ans=0.125 2023-09-30 03:54:40,747 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.992e+02 2.258e+02 2.770e+02 4.284e+02, threshold=4.516e+02, percent-clipped=0.0 2023-09-30 03:54:40,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:44,009 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-09-30 03:54:49,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:54:51,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=588026.6666666666, ans=0.025 2023-09-30 03:54:56,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=588093.3333333334, ans=0.125 2023-09-30 03:54:57,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=588093.3333333334, ans=0.125 2023-09-30 03:55:00,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 03:55:00,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:55:00,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=588093.3333333334, ans=0.0 2023-09-30 03:55:04,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 03:55:05,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:55:08,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:55:08,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:55:10,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:55:12,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=588160.0, ans=0.125 2023-09-30 03:55:15,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 03:55:17,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:55:20,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 03:55:21,548 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.09 vs. limit=6.0 2023-09-30 03:55:23,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 03:55:25,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:55:25,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=588226.6666666666, ans=10.0 2023-09-30 03:55:28,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=588226.6666666666, ans=0.125 2023-09-30 03:55:30,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:55:31,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,886 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 03:55:31,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:55:34,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:55:37,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 03:55:37,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 03:55:38,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 03:55:40,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 03:55:42,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:55:43,572 INFO [train.py:1039] (1/4) Epoch 17, batch 3250, loss[loss=0.2053, simple_loss=0.2538, pruned_loss=0.07841, over 19071.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2555, pruned_loss=0.05385, over 4705070.81 frames. ], batch size: 388, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:55:43,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:55:43,859 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 03:55:43,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:55:45,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:55:47,471 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 03:55:50,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:55:53,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:56:01,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:01,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 03:56:03,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:04,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:04,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:04,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:05,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:56:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:08,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:56:08,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:10,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:13,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:14,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:17,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:17,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:20,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:21,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:21,201 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:25,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 03:56:26,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:56:26,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:56:28,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:28,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:56:36,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:56:42,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:56:42,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:42,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 03:56:42,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:56:42,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:56:42,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:43,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=588493.3333333334, ans=0.0 2023-09-30 03:56:47,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 03:56:47,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 03:56:48,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=588560.0, ans=0.2 2023-09-30 03:56:49,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:49,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=588560.0, ans=0.2 2023-09-30 03:56:50,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:52,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:52,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:56:53,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:57,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:57,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:59,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 03:56:59,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:02,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:57:02,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 03:57:05,630 INFO [train.py:1039] (1/4) Epoch 17, batch 3300, loss[loss=0.1819, simple_loss=0.2702, pruned_loss=0.04681, over 24322.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2568, pruned_loss=0.05421, over 4707285.12 frames. ], batch size: 74, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:57:05,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:57:05,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 03:57:09,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 03:57:11,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 03:57:11,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:15,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:57:17,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:57:17,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:17,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:57:18,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:57:22,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:23,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:57:25,152 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.863e+02 2.079e+02 2.237e+02 3.389e+02, threshold=4.158e+02, percent-clipped=0.0 2023-09-30 03:57:26,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 03:57:28,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:29,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:30,551 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 03:57:30,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:57:32,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:57:33,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:57:33,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:57:33,659 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 03:57:38,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:38,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:57:40,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:40,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 03:57:42,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 03:57:42,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:44,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:57:44,595 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:57:45,738 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 03:57:45,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 03:57:47,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:57:50,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 03:57:52,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:57:54,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:57:54,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:57:56,951 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=12.0 2023-09-30 03:57:57,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:59,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:59,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:59,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:58:01,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:58:02,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:02,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:58:05,427 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 03:58:06,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 03:58:08,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:58:09,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:09,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:12,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:58:12,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:13,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:58:15,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:15,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:58:15,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:17,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=588893.3333333334, ans=0.125 2023-09-30 03:58:18,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:58:20,741 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=588893.3333333334, ans=0.125 2023-09-30 03:58:21,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 03:58:22,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:23,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:25,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:58:25,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:58:26,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:28,041 INFO [train.py:1039] (1/4) Epoch 17, batch 3350, loss[loss=0.1761, simple_loss=0.2486, pruned_loss=0.05179, over 23411.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2578, pruned_loss=0.05495, over 4702382.00 frames. ], batch size: 105, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:58:30,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:30,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:34,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:58:35,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:36,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:58:39,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:41,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:58:41,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=588960.0, ans=0.125 2023-09-30 03:58:43,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:43,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:58:44,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 03:58:46,338 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 03:58:46,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:51,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 03:58:51,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 03:58:53,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:58:53,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:58:54,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:54,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 03:58:54,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:54,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:58:56,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:59,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:59,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:00,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:59:04,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:06,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:06,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=589093.3333333334, ans=0.125 2023-09-30 03:59:07,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:07,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=589093.3333333334, ans=0.125 2023-09-30 03:59:08,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.15 vs. limit=15.0 2023-09-30 03:59:12,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:59:13,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.25 vs. limit=15.0 2023-09-30 03:59:14,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:14,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:14,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:17,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:20,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 03:59:20,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:59:20,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 03:59:20,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:59:24,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 03:59:25,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:27,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:35,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:35,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 03:59:36,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:59:37,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:59:39,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:59:42,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:59:46,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 03:59:46,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:59:46,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:59:49,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:49,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 03:59:50,546 INFO [train.py:1039] (1/4) Epoch 17, batch 3400, loss[loss=0.1743, simple_loss=0.2451, pruned_loss=0.05176, over 23699.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2597, pruned_loss=0.05569, over 4701140.17 frames. ], batch size: 149, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:59:50,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:50,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 03:59:52,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:52,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:53,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:59:55,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:59:55,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 03:59:58,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 03:59:58,839 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 03:59:58,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:04,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:00:04,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:00:05,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:07,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:00:10,132 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.964e+02 2.182e+02 2.544e+02 4.408e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 04:00:13,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:17,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 04:00:22,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:00:25,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:25,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:26,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:00:32,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:00:36,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 04:00:43,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:43,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=589493.3333333334, ans=0.1 2023-09-30 04:00:44,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:44,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 04:00:46,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:00:46,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:46,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:47,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:00:51,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:52,752 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.95 vs. limit=15.0 2023-09-30 04:00:54,054 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=589493.3333333334, ans=0.125 2023-09-30 04:00:55,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:00:55,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:01:00,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:00,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=589560.0, ans=0.0 2023-09-30 04:01:02,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 04:01:07,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:01:12,423 INFO [train.py:1039] (1/4) Epoch 17, batch 3450, loss[loss=0.1619, simple_loss=0.2195, pruned_loss=0.05212, over 22786.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2587, pruned_loss=0.05554, over 4694794.78 frames. ], batch size: 322, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:01:14,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 04:01:19,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 04:01:19,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:01:20,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:01:20,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 04:01:22,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:27,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:01:32,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:01:33,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:33,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:01:33,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:35,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:35,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=589693.3333333334, ans=0.0 2023-09-30 04:01:35,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=589693.3333333334, ans=0.1 2023-09-30 04:01:41,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 04:01:48,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 04:01:48,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:01:48,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:01:48,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=589760.0, ans=0.0 2023-09-30 04:01:50,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:56,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 04:01:57,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:02:00,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:00,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:02:01,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:02:04,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:02:06,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 04:02:06,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:07,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:02:09,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:02:13,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 04:02:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:02:22,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:02:23,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:23,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=589893.3333333334, ans=0.125 2023-09-30 04:02:26,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:31,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:32,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:33,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:02:33,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:34,871 INFO [train.py:1039] (1/4) Epoch 17, batch 3500, loss[loss=0.1955, simple_loss=0.2562, pruned_loss=0.06739, over 23857.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2567, pruned_loss=0.05474, over 4700527.88 frames. ], batch size: 179, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:02:38,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:41,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:02:42,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 04:02:45,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:02:47,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:02:51,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=590026.6666666666, ans=0.125 2023-09-30 04:02:52,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:52,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 04:02:53,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.944e+02 2.191e+02 2.533e+02 4.328e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-30 04:02:57,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:02:57,840 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=590026.6666666666, ans=0.2 2023-09-30 04:02:59,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:03:00,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:03:00,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:00,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:03:00,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:00,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:02,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 04:03:02,578 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=590026.6666666666, ans=0.0 2023-09-30 04:03:05,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:05,916 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:03:08,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:10,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:11,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 04:03:12,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:14,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:15,684 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.47 vs. limit=22.5 2023-09-30 04:03:16,712 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=590093.3333333334, ans=0.1 2023-09-30 04:03:17,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:03:17,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:19,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:03:19,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:21,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 04:03:21,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 04:03:21,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 04:03:22,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:24,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:26,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:26,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:03:30,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:03:31,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:03:35,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=590160.0, ans=0.2 2023-09-30 04:03:37,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:03:38,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 04:03:38,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 04:03:38,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:03:42,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:42,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:42,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=590226.6666666666, ans=0.125 2023-09-30 04:03:42,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=590226.6666666666, ans=0.1 2023-09-30 04:03:44,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:47,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 04:03:48,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:48,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:50,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 04:03:53,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 04:03:53,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=590226.6666666666, ans=0.0 2023-09-30 04:03:54,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:56,855 INFO [train.py:1039] (1/4) Epoch 17, batch 3550, loss[loss=0.1649, simple_loss=0.2385, pruned_loss=0.04558, over 24405.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2553, pruned_loss=0.05433, over 4698451.39 frames. ], batch size: 58, lr: 6.04e-03, grad_scale: 8.0 2023-09-30 04:03:56,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:57,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:03:57,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:00,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=590293.3333333334, ans=0.2 2023-09-30 04:04:02,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:04:09,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:13,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:04:16,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:18,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:04:21,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:21,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:04:21,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:04:26,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:26,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:04:26,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:26,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:04:27,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:04:33,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:04:33,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:34,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:34,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:04:34,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 04:04:34,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 04:04:41,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=590426.6666666666, ans=0.0 2023-09-30 04:04:43,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:45,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:47,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:48,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 04:04:49,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=590493.3333333334, ans=0.125 2023-09-30 04:04:50,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:04:51,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 04:04:52,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:54,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:04:56,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:04:58,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 04:04:58,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:04:59,081 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.53 vs. limit=15.0 2023-09-30 04:05:04,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 04:05:06,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:11,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:05:11,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=590560.0, ans=0.125 2023-09-30 04:05:13,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 04:05:17,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 04:05:19,794 INFO [train.py:1039] (1/4) Epoch 17, batch 3600, loss[loss=0.1862, simple_loss=0.2555, pruned_loss=0.05841, over 23564.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2555, pruned_loss=0.05366, over 4716477.32 frames. ], batch size: 134, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:05:19,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:05:21,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:05:23,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:24,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:25,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:05:30,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:31,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:33,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:05:33,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:05:33,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:33,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 04:05:38,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:05:39,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:40,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.55 vs. limit=15.0 2023-09-30 04:05:41,049 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.895e+02 2.111e+02 2.493e+02 3.633e+02, threshold=4.223e+02, percent-clipped=0.0 2023-09-30 04:05:42,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:44,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:05:46,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:05:47,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:47,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 04:05:49,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:52,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:53,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:05:57,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:59,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:06:00,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:02,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 04:06:06,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=590760.0, ans=0.2 2023-09-30 04:06:07,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=590826.6666666666, ans=0.1 2023-09-30 04:06:10,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:10,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:06:11,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 04:06:15,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:06:20,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:23,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.14 vs. limit=15.0 2023-09-30 04:06:23,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:29,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:06:29,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:06:30,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 04:06:32,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=590893.3333333334, ans=0.125 2023-09-30 04:06:33,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 04:06:34,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=590893.3333333334, ans=0.125 2023-09-30 04:06:35,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 04:06:36,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:38,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:06:39,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 04:06:39,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:06:41,296 INFO [train.py:1039] (1/4) Epoch 17, batch 3650, loss[loss=0.1513, simple_loss=0.2268, pruned_loss=0.03789, over 24336.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2563, pruned_loss=0.05351, over 4731053.80 frames. ], batch size: 56, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:06:41,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:06:41,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:41,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 04:06:42,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 04:06:46,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:47,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 04:06:52,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 04:06:55,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:06:59,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 04:07:00,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 04:07:05,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:05,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:07:06,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:07:10,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 04:07:10,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:07:12,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 04:07:13,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:07:13,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:13,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 04:07:15,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:07:16,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:16,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:18,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:07:19,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 04:07:21,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 04:07:21,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:07:24,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 04:07:24,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:24,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:07:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:07:33,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:33,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:07:34,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:07:35,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:07:38,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:07:41,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:43,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:43,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:44,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:07:44,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:46,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:51,374 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 04:07:54,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:54,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:56,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:07:56,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:07:58,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:07:58,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:59,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 04:07:59,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:03,221 INFO [train.py:1039] (1/4) Epoch 17, batch 3700, loss[loss=0.2389, simple_loss=0.2973, pruned_loss=0.09026, over 19785.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2574, pruned_loss=0.05422, over 4721929.60 frames. ], batch size: 388, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:08:03,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:08:04,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:08:06,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:08:10,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:10,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 04:08:10,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:11,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:08:11,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:08:14,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=591293.3333333334, ans=0.015 2023-09-30 04:08:16,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:08:16,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=591293.3333333334, ans=0.125 2023-09-30 04:08:20,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:08:20,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:22,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:08:22,963 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-09-30 04:08:23,660 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.066e+02 2.372e+02 2.822e+02 4.453e+02, threshold=4.744e+02, percent-clipped=1.0 2023-09-30 04:08:23,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:23,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:08:25,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:29,061 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 04:08:32,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=591360.0, ans=0.07 2023-09-30 04:08:35,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:08:35,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:08:35,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:08:35,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 04:08:37,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:40,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:42,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 04:08:42,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:46,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:08:47,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:47,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:08:49,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:08:53,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:53,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 04:08:54,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:55,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 04:08:59,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=591493.3333333334, ans=0.0 2023-09-30 04:09:02,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:09:02,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:09:05,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:06,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 04:09:08,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:09:08,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:09:08,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:08,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:13,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:15,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 04:09:16,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 04:09:17,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:09:17,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:17,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=591560.0, ans=0.125 2023-09-30 04:09:19,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:09:19,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:09:22,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:09:22,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:09:24,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:09:25,789 INFO [train.py:1039] (1/4) Epoch 17, batch 3750, loss[loss=0.2074, simple_loss=0.2723, pruned_loss=0.07124, over 23785.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2593, pruned_loss=0.05536, over 4717506.11 frames. ], batch size: 179, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:09:26,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 04:09:28,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:09:31,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:09:31,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 04:09:33,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:09:34,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:35,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:38,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:09:38,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=591626.6666666666, ans=0.1 2023-09-30 04:09:41,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:46,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:09:46,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:09:49,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:52,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:09:53,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 04:09:55,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:09:56,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:09:56,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:59,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 04:09:59,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=591760.0, ans=0.0 2023-09-30 04:10:06,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 04:10:06,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:10:06,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:10:09,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:14,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:15,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:10:21,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 04:10:25,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:26,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:10:28,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:10:31,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:10:36,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:10:38,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:10:40,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:10:40,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=591893.3333333334, ans=0.0 2023-09-30 04:10:41,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:10:44,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:10:46,642 INFO [train.py:1039] (1/4) Epoch 17, batch 3800, loss[loss=0.2056, simple_loss=0.2666, pruned_loss=0.07231, over 23855.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2582, pruned_loss=0.05488, over 4719591.71 frames. ], batch size: 195, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:10:51,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=591960.0, ans=0.125 2023-09-30 04:10:52,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:10:57,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:58,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:11:00,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 04:11:01,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:03,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:04,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=592026.6666666666, ans=0.125 2023-09-30 04:11:05,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:11:05,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=592026.6666666666, ans=0.05 2023-09-30 04:11:06,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:11:06,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:06,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:11:08,674 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.935e+02 2.193e+02 2.576e+02 3.771e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-30 04:11:10,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:11,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:11:11,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:11,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 04:11:16,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:11:16,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:11:18,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:20,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:11:20,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:11:22,450 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=592093.3333333334, ans=0.0 2023-09-30 04:11:23,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:11:23,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:26,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:28,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:34,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:11:34,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 04:11:36,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:43,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:11:48,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:11:50,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 04:11:52,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 04:11:53,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:55,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:55,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:55,929 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-09-30 04:11:58,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 04:12:01,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 04:12:01,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 04:12:01,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:03,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:12:09,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:12:10,429 INFO [train.py:1039] (1/4) Epoch 17, batch 3850, loss[loss=0.1821, simple_loss=0.2693, pruned_loss=0.04746, over 24345.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2567, pruned_loss=0.05417, over 4714194.80 frames. ], batch size: 77, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:12:10,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:12:11,208 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.64 vs. limit=6.0 2023-09-30 04:12:15,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:12:18,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 04:12:18,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:12:20,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:23,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:12:27,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:30,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:12:30,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 04:12:35,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:36,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:40,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:40,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:12:42,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:43,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:12:44,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:44,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:12:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:49,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:50,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:50,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:12:51,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=592426.6666666666, ans=0.125 2023-09-30 04:12:52,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 04:12:52,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 04:12:53,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:55,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:57,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:12:59,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:59,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 04:13:02,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 04:13:03,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:05,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 04:13:08,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:13:11,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:13,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:13:18,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:18,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 04:13:20,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 04:13:23,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:23,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:26,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:13:26,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:13:27,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:13:27,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 04:13:29,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:13:29,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 04:13:29,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:29,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:31,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:13:31,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:32,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:13:33,403 INFO [train.py:1039] (1/4) Epoch 17, batch 3900, loss[loss=0.1774, simple_loss=0.2364, pruned_loss=0.05916, over 22709.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2551, pruned_loss=0.05366, over 4709581.65 frames. ], batch size: 322, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:13:33,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:33,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:35,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:13:35,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 04:13:35,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:38,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:39,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:39,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:13:41,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:44,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:44,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:47,616 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.77 vs. limit=12.0 2023-09-30 04:13:48,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:13:49,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 04:13:49,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:13:51,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 04:13:51,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:53,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 04:13:54,484 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.932e+02 2.197e+02 2.548e+02 3.814e+02, threshold=4.393e+02, percent-clipped=0.0 2023-09-30 04:13:54,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 04:13:59,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:03,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:14:03,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:14:04,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:08,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:09,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:14:11,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:14:11,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:12,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:14:20,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:20,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:14:28,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.90 vs. limit=8.0 2023-09-30 04:14:28,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:14:30,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:14:39,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=592893.3333333334, ans=0.2 2023-09-30 04:14:40,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:14:42,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=592893.3333333334, ans=0.0 2023-09-30 04:14:43,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:45,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 04:14:45,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 04:14:45,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:46,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 04:14:48,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:49,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 04:14:54,806 INFO [train.py:1039] (1/4) Epoch 17, batch 3950, loss[loss=0.1693, simple_loss=0.2502, pruned_loss=0.04417, over 23367.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2554, pruned_loss=0.05336, over 4719041.68 frames. ], batch size: 93, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:14:58,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:58,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 04:15:00,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:15:03,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:15:04,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:15:11,401 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 04:15:11,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:13,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 04:15:13,669 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 04:15:15,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:18,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:18,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:15:19,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:21,542 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=593026.6666666666, ans=0.125 2023-09-30 04:15:22,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 04:15:24,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:15:24,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:24,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:15:25,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:15:25,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:15:26,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=593093.3333333334, ans=0.07 2023-09-30 04:15:35,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593093.3333333334, ans=0.1 2023-09-30 04:15:36,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:15:36,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:15:43,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 04:15:48,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 04:15:48,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 04:15:48,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:15:48,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:15:56,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:15:56,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:15:56,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:15:57,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 04:16:00,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=593226.6666666666, ans=0.2 2023-09-30 04:16:05,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:16:06,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:16:11,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 04:16:18,013 INFO [train.py:1039] (1/4) Epoch 17, batch 4000, loss[loss=0.1983, simple_loss=0.2647, pruned_loss=0.06597, over 23615.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2562, pruned_loss=0.05375, over 4708999.76 frames. ], batch size: 256, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:16:19,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:23,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=593293.3333333334, ans=0.0 2023-09-30 04:16:28,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:32,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:34,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:16:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:34,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 04:16:36,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:16:38,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 04:16:38,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:16:38,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 04:16:40,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.856e+02 2.093e+02 2.279e+02 3.915e+02, threshold=4.185e+02, percent-clipped=0.0 2023-09-30 04:16:40,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:40,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=593360.0, ans=0.125 2023-09-30 04:16:43,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:16:43,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:16:43,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:16:43,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:16:43,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:16:46,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:16:47,830 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 04:16:49,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:16:51,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:16:54,385 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 04:16:55,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:16:55,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:04,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 04:17:05,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:17:08,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:17:10,219 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 04:17:10,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:17:10,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 04:17:12,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:17:12,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:14,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=593493.3333333334, ans=0.0 2023-09-30 04:17:15,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:17:16,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:17:16,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:17:16,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:18,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 04:17:18,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:22,309 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 04:17:26,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:17:30,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:17:32,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:17:34,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:35,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:17:35,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:17:39,920 INFO [train.py:1039] (1/4) Epoch 17, batch 4050, loss[loss=0.1836, simple_loss=0.2577, pruned_loss=0.05475, over 23370.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.256, pruned_loss=0.05309, over 4725943.52 frames. ], batch size: 93, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:17:41,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:43,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=593626.6666666666, ans=0.05 2023-09-30 04:17:44,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:17:45,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 04:17:47,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:17:47,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:17:49,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:17:51,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:17:52,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:56,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:59,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:17:59,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=593693.3333333334, ans=0.0 2023-09-30 04:18:00,915 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:18:02,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:18:03,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:18:06,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.98 vs. limit=15.0 2023-09-30 04:18:07,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:09,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:18:12,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 04:18:14,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 04:18:14,570 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 04:18:17,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:18:17,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=593760.0, ans=0.0 2023-09-30 04:18:19,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=593760.0, ans=0.2 2023-09-30 04:18:22,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 04:18:26,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:27,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:32,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:32,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:18:32,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:32,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=593826.6666666666, ans=0.125 2023-09-30 04:18:35,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:18:38,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 04:18:38,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:18:42,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:43,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 04:18:49,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:55,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 04:18:56,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:56,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:19:00,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 04:19:00,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 04:19:00,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:01,733 INFO [train.py:1039] (1/4) Epoch 17, batch 4100, loss[loss=0.2005, simple_loss=0.2637, pruned_loss=0.06859, over 23758.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2581, pruned_loss=0.05469, over 4710287.05 frames. ], batch size: 164, lr: 6.02e-03, grad_scale: 32.0 2023-09-30 04:19:01,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:03,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:03,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:19:08,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=593960.0, ans=0.2 2023-09-30 04:19:09,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 04:19:09,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 04:19:11,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 04:19:11,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 04:19:11,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:13,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:19:15,185 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 04:19:20,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:20,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:19:20,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:21,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:19:23,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=594026.6666666666, ans=0.125 2023-09-30 04:19:24,773 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.800e+02 1.954e+02 2.140e+02 3.094e+02, threshold=3.909e+02, percent-clipped=0.0 2023-09-30 04:19:25,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:19:26,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:27,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:19:28,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 04:19:30,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:30,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:19:30,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:32,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:19:32,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 04:19:35,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:19:35,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 04:19:38,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:19:40,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:40,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 04:19:41,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:43,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:19:43,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:19:46,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 04:19:48,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.44 vs. limit=22.5 2023-09-30 04:19:49,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:19:51,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:19:53,107 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 04:19:55,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:55,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:19:58,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:01,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:05,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:07,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:20:13,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:13,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:16,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:20,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:20:23,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:20:25,123 INFO [train.py:1039] (1/4) Epoch 17, batch 4150, loss[loss=0.1808, simple_loss=0.2697, pruned_loss=0.0459, over 24515.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2581, pruned_loss=0.05502, over 4701581.52 frames. ], batch size: 71, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:20:25,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:20:25,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:20:25,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:29,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 04:20:29,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:30,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 04:20:30,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 04:20:30,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 04:20:33,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:37,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:20:37,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:40,093 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=594293.3333333334, ans=0.125 2023-09-30 04:20:41,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:20:41,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:20:42,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:20:42,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:20:44,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:46,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:20:47,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=594360.0, ans=0.0 2023-09-30 04:20:49,585 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.56 vs. limit=15.0 2023-09-30 04:20:50,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:55,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:20:57,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 04:20:59,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 04:20:59,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:21:01,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 04:21:01,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:21:01,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:04,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:05,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:08,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=594426.6666666666, ans=0.125 2023-09-30 04:21:10,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 04:21:11,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=594426.6666666666, ans=0.125 2023-09-30 04:21:14,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:14,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=594493.3333333334, ans=0.0 2023-09-30 04:21:15,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:21:17,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 04:21:17,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:21:20,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 04:21:20,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:21:22,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:25,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:26,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 04:21:26,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:21:26,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:21:28,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:21:30,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 04:21:30,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:30,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:21:32,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:21:34,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 04:21:34,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:34,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:21:34,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=594560.0, ans=0.125 2023-09-30 04:21:35,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:21:37,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:37,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 04:21:38,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:45,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:21:47,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 04:21:48,365 INFO [train.py:1039] (1/4) Epoch 17, batch 4200, loss[loss=0.2016, simple_loss=0.2837, pruned_loss=0.05978, over 23958.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2564, pruned_loss=0.05501, over 4690366.33 frames. ], batch size: 80, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:21:48,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:21:52,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:21:53,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:21:54,095 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=594626.6666666666, ans=0.0 2023-09-30 04:21:55,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:55,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:56,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 04:22:00,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 04:22:00,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:03,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:08,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:22:12,241 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.960e+02 2.323e+02 2.795e+02 4.279e+02, threshold=4.646e+02, percent-clipped=2.0 2023-09-30 04:22:12,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:22:15,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:15,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:16,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 04:22:16,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:18,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:18,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:22:18,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:22:20,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:22:23,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 04:22:23,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:27,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:22:29,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:22:29,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:22:32,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.33 vs. limit=12.0 2023-09-30 04:22:32,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:22:34,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:22:34,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 04:22:35,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:22:35,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:22:41,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:22:44,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:49,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:22:52,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 04:22:54,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:22:54,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=594893.3333333334, ans=0.2 2023-09-30 04:23:00,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:23:00,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:04,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 04:23:07,852 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.43 vs. limit=22.5 2023-09-30 04:23:10,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:23:12,293 INFO [train.py:1039] (1/4) Epoch 17, batch 4250, loss[loss=0.1921, simple_loss=0.2674, pruned_loss=0.05835, over 24035.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.256, pruned_loss=0.05409, over 4705213.11 frames. ], batch size: 80, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:23:14,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=594960.0, ans=0.0 2023-09-30 04:23:15,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:23:15,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:23:17,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:23,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:23:25,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 04:23:25,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:23:28,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:32,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:36,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:36,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:40,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:23:40,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:23:41,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.48 vs. limit=15.0 2023-09-30 04:23:41,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:41,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:43,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:46,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:23:48,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:50,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 04:23:53,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 04:23:53,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:53,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:53,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:54,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:23:54,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:55,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:58,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:24:00,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:24:05,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:07,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:07,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 04:24:07,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:24:07,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=595160.0, ans=0.125 2023-09-30 04:24:09,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 04:24:11,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:24:12,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:24:14,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:15,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:24:16,064 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=595160.0, ans=0.0 2023-09-30 04:24:17,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 04:24:19,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:24:19,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:24:25,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:28,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:30,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:24:30,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:31,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:33,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:24:34,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:24:34,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 04:24:34,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=595293.3333333334, ans=0.0 2023-09-30 04:24:34,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.78 vs. limit=12.0 2023-09-30 04:24:35,509 INFO [train.py:1039] (1/4) Epoch 17, batch 4300, loss[loss=0.1934, simple_loss=0.2694, pruned_loss=0.0587, over 22845.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2549, pruned_loss=0.05375, over 4694616.15 frames. ], batch size: 50, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:24:35,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:40,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:40,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:24:46,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:54,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=595360.0, ans=0.1 2023-09-30 04:24:55,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:55,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 04:24:57,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:24:59,971 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.855e+02 2.115e+02 2.472e+02 4.142e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 04:25:00,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:25:00,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:25:00,157 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 04:25:03,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:25:03,447 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=595360.0, ans=10.0 2023-09-30 04:25:03,488 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=595360.0, ans=0.125 2023-09-30 04:25:04,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:05,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=595360.0, ans=0.07 2023-09-30 04:25:08,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 04:25:08,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:25:09,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 04:25:11,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:25:15,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:25:16,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=595426.6666666666, ans=0.125 2023-09-30 04:25:18,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:25:18,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:25:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:25:21,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:23,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:25:23,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 04:25:23,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=595493.3333333334, ans=0.0 2023-09-30 04:25:24,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 04:25:26,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:25:29,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:25:29,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:31,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 04:25:31,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 04:25:33,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 04:25:34,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:25:34,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 04:25:36,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 04:25:39,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:41,094 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 04:25:41,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:25:44,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:44,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:46,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 04:25:47,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:47,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:49,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:25:49,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:51,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:25:53,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:25:53,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:54,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:54,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:55,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=595560.0, ans=0.035 2023-09-30 04:25:57,834 INFO [train.py:1039] (1/4) Epoch 17, batch 4350, loss[loss=0.1825, simple_loss=0.2525, pruned_loss=0.05626, over 23656.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2557, pruned_loss=0.05367, over 4704513.12 frames. ], batch size: 149, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:26:00,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 04:26:01,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:26:06,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:09,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:09,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=595626.6666666666, ans=0.2 2023-09-30 04:26:11,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:26:11,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:26:16,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:26:20,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:24,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:26:24,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:26:26,490 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:26:27,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:26:30,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:26:32,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:26:37,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 04:26:39,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:40,432 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.82 vs. limit=6.0 2023-09-30 04:26:41,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:42,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.87 vs. limit=6.0 2023-09-30 04:26:44,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:47,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 04:26:51,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:26:52,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:26:52,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=595826.6666666666, ans=0.1 2023-09-30 04:26:58,215 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:26:59,300 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 04:27:00,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:00,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:27:03,809 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 04:27:03,943 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 04:27:03,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:03,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:05,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:27:05,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:05,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=595893.3333333334, ans=0.0 2023-09-30 04:27:07,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:07,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:10,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 04:27:10,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:10,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:10,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:11,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 04:27:13,947 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 04:27:13,955 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 04:27:13,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 04:27:14,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=595893.3333333334, ans=0.125 2023-09-30 04:27:14,373 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:27:18,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:27:19,862 INFO [train.py:1039] (1/4) Epoch 17, batch 4400, loss[loss=0.1673, simple_loss=0.2397, pruned_loss=0.04747, over 23247.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2565, pruned_loss=0.05406, over 4712909.79 frames. ], batch size: 105, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:27:19,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:27:19,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:21,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:27:23,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 04:27:24,665 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 04:27:24,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:28,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:28,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:30,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:31,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 04:27:33,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 04:27:33,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 04:27:33,876 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 04:27:34,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=595960.0, ans=0.0 2023-09-30 04:27:34,812 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.40 vs. limit=15.0 2023-09-30 04:27:35,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:27:35,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:38,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 04:27:40,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:41,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 04:27:44,836 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.901e+02 2.144e+02 2.567e+02 3.604e+02, threshold=4.289e+02, percent-clipped=0.0 2023-09-30 04:27:44,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:45,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 04:27:46,408 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 04:27:48,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 04:27:48,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 04:27:50,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 04:27:50,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:52,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:54,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 04:27:54,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 04:27:55,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:56,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=596093.3333333334, ans=0.5 2023-09-30 04:27:57,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:27:57,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:57,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=596093.3333333334, ans=0.2 2023-09-30 04:27:59,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:01,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:28:01,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 04:28:01,207 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 04:28:06,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:13,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:28:17,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 04:28:20,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:28:23,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:27,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:28:27,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 04:28:27,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:28:27,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:28:27,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:28:28,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:28:32,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 04:28:35,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 04:28:36,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 04:28:36,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:28:37,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 04:28:38,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.49 vs. limit=15.0 2023-09-30 04:28:38,253 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.19 vs. limit=10.0 2023-09-30 04:28:38,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:28:39,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=596226.6666666666, ans=0.0 2023-09-30 04:28:41,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:28:44,023 INFO [train.py:1039] (1/4) Epoch 17, batch 4450, loss[loss=0.2434, simple_loss=0.2972, pruned_loss=0.09481, over 19673.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2577, pruned_loss=0.05474, over 4701159.40 frames. ], batch size: 389, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:28:44,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 04:28:44,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=596293.3333333334, ans=0.0 2023-09-30 04:28:44,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=596293.3333333334, ans=0.125 2023-09-30 04:28:47,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:48,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:48,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:28:55,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:28:55,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=596293.3333333334, ans=0.0 2023-09-30 04:28:55,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=596293.3333333334, ans=0.125 2023-09-30 04:28:57,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:29:00,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:03,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:29:03,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:29:04,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:07,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 04:29:07,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:07,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:07,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:07,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:29:07,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=596360.0, ans=0.09899494936611666 2023-09-30 04:29:10,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:29:16,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:18,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:18,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:29:22,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:29:24,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 04:29:24,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 04:29:24,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:29:26,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=596426.6666666666, ans=0.125 2023-09-30 04:29:29,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:30,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 04:29:34,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:29:40,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:41,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 04:29:41,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:41,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:41,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:29:41,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:43,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:47,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:29:47,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 04:29:50,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:29:50,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=596560.0, ans=0.5 2023-09-30 04:29:50,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=596560.0, ans=0.0 2023-09-30 04:29:51,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:52,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=596560.0, ans=0.2 2023-09-30 04:29:53,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:54,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:56,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:29:59,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:30:01,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 04:30:02,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:30:06,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=596626.6666666666, ans=0.125 2023-09-30 04:30:07,627 INFO [train.py:1039] (1/4) Epoch 17, batch 4500, loss[loss=0.1842, simple_loss=0.2411, pruned_loss=0.06361, over 23399.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2581, pruned_loss=0.05514, over 4700293.40 frames. ], batch size: 285, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:30:07,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:09,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 04:30:09,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 04:30:11,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:19,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:30:19,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:21,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:30:23,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:30:23,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:23,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:31,164 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=596693.3333333334, ans=0.125 2023-09-30 04:30:32,274 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.892e+02 2.101e+02 2.322e+02 3.249e+02, threshold=4.203e+02, percent-clipped=0.0 2023-09-30 04:30:35,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:35,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:30:38,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:30:40,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:30:41,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:30:49,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:30:53,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:30:56,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:31:00,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:31:01,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 04:31:01,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:03,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:04,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:06,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:31:07,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:31:07,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 04:31:07,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:31:07,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:13,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:31:14,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:31:19,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:22,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:31:22,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:31:25,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 04:31:25,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 04:31:25,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 04:31:30,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 04:31:31,551 INFO [train.py:1039] (1/4) Epoch 17, batch 4550, loss[loss=0.1773, simple_loss=0.2649, pruned_loss=0.04486, over 24382.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.257, pruned_loss=0.05437, over 4704123.53 frames. ], batch size: 77, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:31:34,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 04:31:35,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:36,142 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=596960.0, ans=0.125 2023-09-30 04:31:38,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:39,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:43,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:48,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:31:49,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:51,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:31:51,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:31:51,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:53,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:53,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:54,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=597026.6666666666, ans=0.1 2023-09-30 04:31:57,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:32:00,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 04:32:00,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=597026.6666666666, ans=0.1 2023-09-30 04:32:02,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 04:32:03,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:32:04,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 04:32:06,064 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.48 vs. limit=15.0 2023-09-30 04:32:07,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 04:32:08,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:11,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 04:32:13,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:32:14,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:32:16,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=597093.3333333334, ans=0.2 2023-09-30 04:32:18,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 04:32:21,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:25,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:25,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:26,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:26,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 04:32:28,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 04:32:28,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:32:28,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=597160.0, ans=0.0 2023-09-30 04:32:30,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 04:32:31,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 04:32:31,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:35,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:35,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:32:37,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:37,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:32:39,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:32:40,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 04:32:40,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=597226.6666666666, ans=0.0 2023-09-30 04:32:42,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:43,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:32:43,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 04:32:43,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:32:43,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 04:32:44,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=597226.6666666666, ans=0.2 2023-09-30 04:32:45,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=597226.6666666666, ans=0.125 2023-09-30 04:32:46,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:32:47,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:32:50,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:32:50,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:50,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:32:50,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.23 vs. limit=6.0 2023-09-30 04:32:52,984 INFO [train.py:1039] (1/4) Epoch 17, batch 4600, loss[loss=0.1759, simple_loss=0.254, pruned_loss=0.04896, over 21136.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.256, pruned_loss=0.05425, over 4699830.59 frames. ], batch size: 46, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:32:53,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:32:54,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:32:57,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:59,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:33:01,937 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=12.0 2023-09-30 04:33:03,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:33:03,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:33:03,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=597293.3333333334, ans=0.2 2023-09-30 04:33:05,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:05,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 04:33:08,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:33:10,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=597360.0, ans=0.125 2023-09-30 04:33:13,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:33:13,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:18,329 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.818e+02 2.016e+02 2.179e+02 3.781e+02, threshold=4.032e+02, percent-clipped=0.0 2023-09-30 04:33:19,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.96 vs. limit=15.0 2023-09-30 04:33:23,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 04:33:23,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:25,773 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=6.71 vs. limit=12.0 2023-09-30 04:33:26,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:26,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=597426.6666666666, ans=0.0 2023-09-30 04:33:29,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:33:29,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=597426.6666666666, ans=0.1 2023-09-30 04:33:30,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:36,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 04:33:36,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:33:37,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:33:42,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:44,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:33:46,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:33:50,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 04:33:51,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:33:52,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=597493.3333333334, ans=0.5 2023-09-30 04:33:56,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:33:56,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=597493.3333333334, ans=0.2 2023-09-30 04:33:57,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:00,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 04:34:00,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:00,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 04:34:00,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:03,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:03,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:34:04,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:05,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 04:34:06,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 04:34:06,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 04:34:06,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:09,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:09,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:10,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:15,483 INFO [train.py:1039] (1/4) Epoch 17, batch 4650, loss[loss=0.162, simple_loss=0.2066, pruned_loss=0.05868, over 19162.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2553, pruned_loss=0.05389, over 4694655.09 frames. ], batch size: 388, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:34:20,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:34:25,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.69 vs. limit=12.0 2023-09-30 04:34:26,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:26,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:27,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:34:27,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:27,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:27,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:32,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 04:34:35,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:34:38,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 04:34:38,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:40,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 04:34:40,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:34:40,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 04:34:42,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 04:34:42,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:43,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:34:45,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:34:47,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:47,303 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 04:34:49,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=597760.0, ans=0.1 2023-09-30 04:34:50,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:52,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 04:34:56,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:56,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:34:56,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 04:34:59,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:00,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:35:05,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:10,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:13,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:15,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:15,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:35:18,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 04:35:18,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 04:35:20,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 04:35:20,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 04:35:22,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:30,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:35:30,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:31,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 04:35:31,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:32,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:32,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:35:34,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:35:35,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:35:35,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:36,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:39,573 INFO [train.py:1039] (1/4) Epoch 17, batch 4700, loss[loss=0.1924, simple_loss=0.276, pruned_loss=0.05444, over 24385.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2556, pruned_loss=0.05354, over 4703558.97 frames. ], batch size: 77, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:35:39,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:39,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:35:40,796 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.70 vs. limit=10.0 2023-09-30 04:35:41,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:35:42,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 04:35:44,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:35:44,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 04:35:44,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=597960.0, ans=0.0 2023-09-30 04:35:52,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:55,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:55,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:56,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:59,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:36:03,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 04:36:03,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 04:36:04,679 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.407e+02 1.853e+02 2.008e+02 2.274e+02 4.210e+02, threshold=4.016e+02, percent-clipped=1.0 2023-09-30 04:36:06,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:08,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:36:08,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:36:08,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=598026.6666666666, ans=0.1 2023-09-30 04:36:09,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=598026.6666666666, ans=0.125 2023-09-30 04:36:11,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=598093.3333333334, ans=0.125 2023-09-30 04:36:11,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=22.5 2023-09-30 04:36:13,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:17,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:36:19,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:36:22,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:36:32,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 04:36:33,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:36:35,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:38,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 04:36:39,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:36:43,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:36:44,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 04:36:46,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:46,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:47,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.75 vs. limit=15.0 2023-09-30 04:36:50,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:50,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:36:51,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 04:36:52,035 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 04:36:54,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:56,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=598226.6666666666, ans=0.2 2023-09-30 04:36:57,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 04:36:58,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:37:00,900 INFO [train.py:1039] (1/4) Epoch 17, batch 4750, loss[loss=0.1569, simple_loss=0.2337, pruned_loss=0.04009, over 24356.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2565, pruned_loss=0.05361, over 4701482.41 frames. ], batch size: 56, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:37:03,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 04:37:06,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:37:06,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:37:14,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=598293.3333333334, ans=0.1 2023-09-30 04:37:15,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 04:37:16,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:19,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 04:37:21,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:37:21,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:37:22,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:26,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 04:37:32,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:37:34,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 04:37:36,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:37,012 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:37:39,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:39,880 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 04:37:39,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 04:37:46,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 04:37:48,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:51,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:37:53,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=598493.3333333334, ans=0.0 2023-09-30 04:37:54,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:37:54,659 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 04:37:54,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:37:56,461 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=598493.3333333334, ans=0.0 2023-09-30 04:37:59,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:38:01,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:38:03,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 04:38:03,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 04:38:03,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:04,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:38:04,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:04,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=598493.3333333334, ans=0.025 2023-09-30 04:38:06,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:38:06,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 04:38:09,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 04:38:11,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:14,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=598560.0, ans=0.0 2023-09-30 04:38:15,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:38:15,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 04:38:16,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:19,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:21,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:38:21,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:22,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:38:24,088 INFO [train.py:1039] (1/4) Epoch 17, batch 4800, loss[loss=0.1685, simple_loss=0.2492, pruned_loss=0.04393, over 24320.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2571, pruned_loss=0.05396, over 4711382.88 frames. ], batch size: 61, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:38:26,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:26,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 04:38:26,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 04:38:28,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 04:38:31,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:38:32,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:34,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 04:38:39,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:39,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:41,374 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:38:44,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:38:47,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:47,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:47,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 04:38:48,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=598693.3333333334, ans=0.0 2023-09-30 04:38:49,115 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.366e+02 1.827e+02 2.030e+02 2.375e+02 4.462e+02, threshold=4.061e+02, percent-clipped=1.0 2023-09-30 04:38:49,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:49,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:38:50,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:38:54,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:56,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:56,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:38:57,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:57,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:38:58,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:59,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:02,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:05,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:39:07,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:39:09,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:11,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 04:39:11,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 04:39:12,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:12,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:39:12,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:39:12,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:12,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:39:16,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:39:16,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:19,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:23,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:23,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:27,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=598826.6666666666, ans=0.125 2023-09-30 04:39:28,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 04:39:29,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:29,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:29,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:39:30,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:33,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:35,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:39:35,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:36,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:39:37,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:39:37,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:39:37,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=598893.3333333334, ans=0.125 2023-09-30 04:39:42,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:43,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:43,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:43,884 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:39:45,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 04:39:48,819 INFO [train.py:1039] (1/4) Epoch 17, batch 4850, loss[loss=0.1879, simple_loss=0.271, pruned_loss=0.05239, over 24434.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2577, pruned_loss=0.05427, over 4712213.22 frames. ], batch size: 69, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:39:48,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 04:39:48,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:48,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:49,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:39:49,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:52,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:40:01,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 04:40:03,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:09,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:11,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:40:11,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:14,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:16,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:40:17,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:40:17,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 04:40:23,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:40:24,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:40:26,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:40:26,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:40:26,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 04:40:29,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:29,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:30,184 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=599093.3333333334, ans=0.025 2023-09-30 04:40:33,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 04:40:33,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 04:40:35,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:40:42,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:40:43,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 04:40:45,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:40:45,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:40:48,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:40:48,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 04:40:48,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:49,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 04:40:49,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:51,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:40:53,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 04:41:03,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:08,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:41:08,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:10,612 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=599293.3333333334, ans=0.0 2023-09-30 04:41:11,702 INFO [train.py:1039] (1/4) Epoch 17, batch 4900, loss[loss=0.1716, simple_loss=0.2415, pruned_loss=0.05085, over 23736.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2564, pruned_loss=0.05392, over 4718424.28 frames. ], batch size: 149, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:41:15,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 04:41:15,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:41:19,099 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.76 vs. limit=22.5 2023-09-30 04:41:19,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:21,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:21,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:41:21,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=599293.3333333334, ans=0.125 2023-09-30 04:41:24,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 04:41:30,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 04:41:35,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 04:41:36,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.955e+02 2.270e+02 2.616e+02 3.974e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-30 04:41:36,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 04:41:36,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:36,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:36,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:41:36,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:36,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:41:38,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 04:41:43,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 04:41:43,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:41:44,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:41:45,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:46,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:41:48,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:48,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:48,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 04:41:50,044 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.23 vs. limit=15.0 2023-09-30 04:41:50,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:41:52,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:52,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 04:41:52,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 04:41:56,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 04:41:57,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=599426.6666666666, ans=0.025 2023-09-30 04:41:58,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:41:59,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:41:59,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:42:01,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:01,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:42:01,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:42:03,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 04:42:05,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:07,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:42:08,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=599493.3333333334, ans=0.125 2023-09-30 04:42:10,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:42:15,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 04:42:15,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:42:15,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 04:42:16,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 04:42:22,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:23,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:42:25,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 04:42:25,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:26,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:42:28,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:31,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:31,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:42:31,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:32,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:42:34,764 INFO [train.py:1039] (1/4) Epoch 17, batch 4950, loss[loss=0.1773, simple_loss=0.2427, pruned_loss=0.056, over 23892.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2554, pruned_loss=0.05367, over 4721641.75 frames. ], batch size: 195, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:42:34,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:42:39,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:39,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:42,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 04:42:44,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 04:42:44,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:42:45,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 04:42:45,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:45,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:42:47,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:42:47,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:42:49,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:49,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:42:51,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:42:52,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:55,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:55,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:59,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:43:05,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:06,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.38 vs. limit=22.5 2023-09-30 04:43:06,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:43:08,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:08,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:11,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:43:12,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 04:43:12,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 04:43:15,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:18,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:43:18,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:43:20,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:43:21,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:43:21,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:43:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:26,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:43:28,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:43:28,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=599826.6666666666, ans=0.0 2023-09-30 04:43:31,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:31,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:31,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 04:43:33,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:43:35,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:43:38,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:43:39,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:43:39,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:43:40,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:41,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:43:41,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:43:44,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:43:44,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:43:44,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:46,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 04:43:51,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:43:55,626 INFO [train.py:1039] (1/4) Epoch 17, batch 5000, loss[loss=0.1982, simple_loss=0.2633, pruned_loss=0.06656, over 22727.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2553, pruned_loss=0.05327, over 4726549.58 frames. ], batch size: 322, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:43:57,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 04:43:57,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:43:58,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=599960.0, ans=0.0 2023-09-30 04:43:58,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=599960.0, ans=0.0 2023-09-30 04:44:05,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:05,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:07,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 04:44:07,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 04:44:09,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:44:11,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 04:44:11,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:44:11,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:44:12,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 04:44:14,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:15,200 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.71 vs. limit=22.5 2023-09-30 04:44:16,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:16,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 04:44:17,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:17,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:20,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 04:44:20,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 04:44:20,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:44:22,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.867e+02 2.091e+02 2.376e+02 3.821e+02, threshold=4.182e+02, percent-clipped=0.0 2023-09-30 04:44:22,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 04:44:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:44:23,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:23,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:44:23,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 04:44:23,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 04:44:26,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 04:44:27,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:27,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:29,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 04:44:29,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:32,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:34,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:34,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:44:36,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 04:44:36,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:44:39,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:44:42,464 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 04:44:44,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:46,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:46,268 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:44:49,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 04:44:49,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:51,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:51,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:53,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 04:44:53,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:56,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:58,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:03,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 04:45:03,632 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=600226.6666666666, ans=0.0 2023-09-30 04:45:06,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:08,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=600226.6666666666, ans=0.0 2023-09-30 04:45:09,114 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=22.5 2023-09-30 04:45:18,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:45:19,648 INFO [train.py:1039] (1/4) Epoch 17, batch 5050, loss[loss=0.2018, simple_loss=0.2687, pruned_loss=0.0675, over 23660.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2558, pruned_loss=0.05379, over 4723122.06 frames. ], batch size: 232, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:45:19,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:19,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:45:19,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:19,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:45:19,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:45:19,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 04:45:25,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:45:28,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:29,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:45:29,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 04:45:31,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:31,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:45:35,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:45:36,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:45:36,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:45:46,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 04:45:46,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:45:48,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:45:48,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 04:45:48,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:45:50,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:50,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:51,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:45:51,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 04:45:51,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 04:45:53,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:53,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=600426.6666666666, ans=0.125 2023-09-30 04:45:55,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:00,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:46:00,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 04:46:04,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:05,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 04:46:07,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:46:07,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:46:07,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:08,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:46:09,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=600493.3333333334, ans=0.0 2023-09-30 04:46:11,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:12,010 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=600493.3333333334, ans=0.125 2023-09-30 04:46:13,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:46:13,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=600493.3333333334, ans=0.5 2023-09-30 04:46:15,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:15,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:46:15,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:46:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 04:46:15,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:46:16,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:46:19,625 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.06 vs. limit=8.0 2023-09-30 04:46:20,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:20,249 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 04:46:20,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:46:23,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:25,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:25,338 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 04:46:27,664 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.64 vs. limit=15.0 2023-09-30 04:46:29,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:29,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 04:46:29,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:35,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 04:46:37,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 04:46:38,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:40,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:46:40,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:46:42,014 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 04:46:43,304 INFO [train.py:1039] (1/4) Epoch 17, batch 5100, loss[loss=0.1841, simple_loss=0.2556, pruned_loss=0.05627, over 23777.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2567, pruned_loss=0.05412, over 4721017.10 frames. ], batch size: 164, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:46:44,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:48,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 04:46:48,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=600626.6666666666, ans=0.2 2023-09-30 04:46:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 04:46:51,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:52,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:55,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:55,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 04:46:55,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 04:47:02,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:47:04,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:47:04,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=600693.3333333334, ans=0.125 2023-09-30 04:47:08,740 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.763e+02 1.981e+02 2.119e+02 3.450e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 04:47:08,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:47:12,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 04:47:14,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:16,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:47:16,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:47:18,973 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.59 vs. limit=15.0 2023-09-30 04:47:19,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 04:47:22,848 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 04:47:22,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:24,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 04:47:24,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 04:47:26,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:37,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:47:38,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-09-30 04:47:39,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 04:47:39,564 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 04:47:39,579 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 04:47:41,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 04:47:41,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:44,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 04:47:44,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=600826.6666666666, ans=0.125 2023-09-30 04:47:48,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=600893.3333333334, ans=0.125 2023-09-30 04:47:50,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 04:47:53,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:47:54,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:47:57,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 04:47:59,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:47:59,286 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 04:48:01,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=600893.3333333334, ans=0.0 2023-09-30 04:48:04,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=600960.0, ans=0.125 2023-09-30 04:48:05,694 INFO [train.py:1039] (1/4) Epoch 17, batch 5150, loss[loss=0.1978, simple_loss=0.2561, pruned_loss=0.06976, over 23766.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.257, pruned_loss=0.05404, over 4726998.96 frames. ], batch size: 212, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:48:05,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:48:05,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:48:05,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:48:05,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:48:07,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:48:07,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:48:07,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=600960.0, ans=0.125 2023-09-30 04:48:09,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 04:48:09,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 04:48:09,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 04:48:09,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:48:09,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 04:48:11,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:12,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 04:48:12,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:14,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:14,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=600960.0, ans=0.0 2023-09-30 04:48:20,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:48:20,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 04:48:22,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:22,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:48:26,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:48:26,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:26,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:26,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:48:26,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:48:27,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 04:48:28,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:48:29,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:48:31,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:48:31,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=601026.6666666666, ans=0.125 2023-09-30 04:48:32,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 04:48:34,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:48:36,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=601026.6666666666, ans=0.125 2023-09-30 04:48:40,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:48:43,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 04:48:46,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:52,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:53,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:54,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=601160.0, ans=0.0 2023-09-30 04:49:00,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:00,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:03,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 04:49:06,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:49:06,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:49:08,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:49:11,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:11,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:13,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 04:49:18,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:19,606 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.86 vs. limit=6.0 2023-09-30 04:49:20,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:49:23,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:49:23,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:49:23,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:49:23,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=601226.6666666666, ans=0.125 2023-09-30 04:49:24,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:49:24,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:49:25,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:49:25,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=601226.6666666666, ans=0.125 2023-09-30 04:49:27,922 INFO [train.py:1039] (1/4) Epoch 17, batch 5200, loss[loss=0.2409, simple_loss=0.3015, pruned_loss=0.09013, over 19739.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2575, pruned_loss=0.05402, over 4733673.18 frames. ], batch size: 388, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:49:29,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:49:31,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:49:35,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:39,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=601293.3333333334, ans=0.2 2023-09-30 04:49:40,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 04:49:42,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:49:44,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:46,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:47,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:49:47,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:50,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 04:49:53,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:49:53,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:55,070 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.818e+02 1.970e+02 2.155e+02 3.146e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 04:49:56,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 04:49:56,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=601360.0, ans=0.125 2023-09-30 04:49:57,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:49:59,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:49:59,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 04:50:00,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 04:50:03,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 04:50:03,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:03,835 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 04:50:05,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:50:06,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:06,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=601426.6666666666, ans=0.125 2023-09-30 04:50:07,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:50:08,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 04:50:08,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:10,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:14,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 04:50:14,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 04:50:14,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 04:50:20,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 04:50:21,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:50:27,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:50:27,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:29,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 04:50:30,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:30,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 04:50:30,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:30,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:50:35,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:36,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:50:40,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:42,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:50:42,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:46,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:48,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 04:50:50,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:50,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:50:50,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:51,672 INFO [train.py:1039] (1/4) Epoch 17, batch 5250, loss[loss=0.1858, simple_loss=0.2728, pruned_loss=0.04938, over 24591.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2564, pruned_loss=0.05397, over 4730415.87 frames. ], batch size: 71, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:50:51,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:50:53,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:50:56,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:51:00,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:00,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:51:01,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:51:07,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:51:07,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:51:10,971 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.90 vs. limit=15.0 2023-09-30 04:51:11,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:51:13,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:51:15,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 04:51:15,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:17,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.82 vs. limit=15.0 2023-09-30 04:51:18,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:51:22,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=601760.0, ans=0.0 2023-09-30 04:51:27,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=601760.0, ans=0.07 2023-09-30 04:52:06,478 INFO [train.py:1039] (1/4) Epoch 17, batch 5300, loss[loss=0.1556, simple_loss=0.2293, pruned_loss=0.04095, over 24270.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2555, pruned_loss=0.05379, over 4713953.43 frames. ], batch size: 56, lr: 5.98e-03, grad_scale: 16.0 2023-09-30 04:52:21,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:52:21,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 04:52:21,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 04:52:21,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:22,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:22,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:22,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:22,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:22,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:52:22,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:22,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:52:23,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:52:23,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 04:52:23,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 04:52:23,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 04:52:23,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:52:23,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 04:52:23,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 04:52:24,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:24,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:24,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:24,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:25,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:52:25,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:25,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:25,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:25,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:25,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:25,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:52:25,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:26,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:52:26,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 04:52:27,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:27,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:27,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 04:52:27,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 04:52:27,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:52:27,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:52:27,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 04:52:28,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 04:52:28,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:29,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:52:29,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:29,668 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 04:52:29,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 04:52:29,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:52:29,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:30,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 04:52:30,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 04:52:30,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 04:52:30,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:39,900 INFO [train.py:1039] (1/4) Epoch 18, batch 0, loss[loss=0.1923, simple_loss=0.2664, pruned_loss=0.05909, over 23332.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2664, pruned_loss=0.05909, over 23332.00 frames. ], batch size: 93, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:52:39,900 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 04:52:53,322 INFO [train.py:1071] (1/4) Epoch 18, validation: loss=0.3168, simple_loss=0.2865, pruned_loss=0.1735, over 1125622.00 frames. 2023-09-30 04:52:53,323 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 04:52:57,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 04:52:58,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:52:58,909 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=602040.0, ans=0.125 2023-09-30 04:53:00,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:53:01,502 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.872e+02 2.065e+02 2.362e+02 3.138e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-30 04:53:02,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=602040.0, ans=0.125 2023-09-30 04:53:06,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:06,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:53:06,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:08,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 04:53:09,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 04:53:11,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:13,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:16,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:53:16,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:19,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 04:53:19,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:29,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:53:29,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:31,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 04:53:34,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:53:34,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:53:36,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:37,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=602173.3333333334, ans=0.125 2023-09-30 04:53:41,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:53:44,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:51,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 04:53:55,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 04:53:55,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:53:55,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:53:57,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:53:59,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:01,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 04:54:03,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:05,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:10,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:13,556 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 04:54:13,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=602373.3333333334, ans=0.125 2023-09-30 04:54:15,521 INFO [train.py:1039] (1/4) Epoch 18, batch 50, loss[loss=0.1997, simple_loss=0.2713, pruned_loss=0.06404, over 23302.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2586, pruned_loss=0.05283, over 1066494.21 frames. ], batch size: 106, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:54:17,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:54:20,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:20,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:54:21,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 04:54:21,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:54:21,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:54:23,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:23,953 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.03 vs. limit=15.0 2023-09-30 04:54:24,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:26,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:28,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.58 vs. limit=15.0 2023-09-30 04:54:29,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 04:54:29,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:37,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:54:38,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 04:54:41,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 04:54:43,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:54:44,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:54:44,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:46,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:48,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:54:48,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:54:48,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:57,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:54:58,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:58,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:55:00,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 04:55:03,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:55:03,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:55:03,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 04:55:03,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:06,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 04:55:08,603 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=602573.3333333334, ans=0.125 2023-09-30 04:55:13,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:13,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=602573.3333333334, ans=0.0 2023-09-30 04:55:15,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:55:15,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:16,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:18,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:21,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 04:55:21,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 04:55:21,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:21,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:24,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:55:24,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:26,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 04:55:28,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 04:55:28,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:55:29,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:30,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=602640.0, ans=0.2 2023-09-30 04:55:31,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:55:31,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 04:55:31,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 04:55:32,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:34,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:35,732 INFO [train.py:1039] (1/4) Epoch 18, batch 100, loss[loss=0.1893, simple_loss=0.2541, pruned_loss=0.06221, over 23830.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.26, pruned_loss=0.0542, over 1881585.90 frames. ], batch size: 212, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:55:35,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:55:35,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:55:39,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:55:39,834 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.71 vs. limit=15.0 2023-09-30 04:55:42,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:55:44,526 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.88 vs. limit=15.0 2023-09-30 04:55:44,969 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.863e+02 2.072e+02 2.465e+02 3.411e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-30 04:55:45,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:47,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 04:55:47,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:51,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:55:51,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:51,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:51,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:51,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:52,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 04:55:54,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:55:55,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:55,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:55,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:57,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=602773.3333333334, ans=0.125 2023-09-30 04:56:00,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 04:56:01,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:02,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:04,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:56:05,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:56:10,194 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 04:56:10,221 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 04:56:11,786 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:11,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:56:15,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:56:17,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:19,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:22,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=602840.0, ans=0.0 2023-09-30 04:56:24,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:25,665 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 04:56:27,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:56:27,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=602906.6666666666, ans=0.2 2023-09-30 04:56:30,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:56:31,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:56:35,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:38,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:40,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:56:43,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:56:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:47,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:49,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:49,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:56:51,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:51,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 04:56:51,479 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 04:56:51,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:53,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:56:53,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:53,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:53,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:56:55,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:56:55,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:56:55,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:56,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:56,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:57,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:56:58,448 INFO [train.py:1039] (1/4) Epoch 18, batch 150, loss[loss=0.1674, simple_loss=0.2426, pruned_loss=0.04615, over 20363.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2601, pruned_loss=0.0551, over 2503403.50 frames. ], batch size: 44, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:56:58,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:57:02,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:05,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:57:05,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:05,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:07,115 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:57:08,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:08,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:08,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=603040.0, ans=0.0 2023-09-30 04:57:11,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:57:11,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:16,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 04:57:16,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 04:57:16,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 04:57:19,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:57:19,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:57:21,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:57:23,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:57:23,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:24,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:24,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:24,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=603106.6666666666, ans=0.2 2023-09-30 04:57:26,191 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 04:57:27,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:33,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=603173.3333333334, ans=0.0 2023-09-30 04:57:35,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:39,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:57:39,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 04:57:42,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:57:42,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:43,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:57:44,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:57:48,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:49,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:57:51,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:52,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 04:57:54,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=603240.0, ans=0.125 2023-09-30 04:57:58,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:58,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:57:58,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:57:59,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:58:01,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:02,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:58:06,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:58:07,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:58:09,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:12,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:58:12,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 04:58:12,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:58:12,797 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 04:58:16,897 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.24 vs. limit=5.0 2023-09-30 04:58:17,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:20,319 INFO [train.py:1039] (1/4) Epoch 18, batch 200, loss[loss=0.163, simple_loss=0.2428, pruned_loss=0.0416, over 24568.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2611, pruned_loss=0.05595, over 2986302.61 frames. ], batch size: 60, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:58:20,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:58:20,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:58:23,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 04:58:23,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:23,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:24,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.50 vs. limit=22.5 2023-09-30 04:58:27,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 04:58:30,021 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.857e+02 2.048e+02 2.282e+02 3.617e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 04:58:30,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:58:32,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:32,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:35,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:58:35,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:35,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:56,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:58:57,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:58:58,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:58:59,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:58:59,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:58:59,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:59:01,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:03,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:59:03,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:03,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:06,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 04:59:06,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:59:06,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:11,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:59:13,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.29 vs. limit=12.0 2023-09-30 04:59:21,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:27,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:27,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=603640.0, ans=0.125 2023-09-30 04:59:29,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:59:35,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:36,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=603640.0, ans=6.0 2023-09-30 04:59:38,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 04:59:38,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:38,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:59:39,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:40,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:59:41,943 INFO [train.py:1039] (1/4) Epoch 18, batch 250, loss[loss=0.1759, simple_loss=0.2475, pruned_loss=0.05209, over 23273.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2604, pruned_loss=0.05546, over 3372413.57 frames. ], batch size: 119, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:59:42,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 04:59:44,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:59:44,138 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 04:59:45,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:59:47,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:52,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:59:52,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:53,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:59:56,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:05,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:09,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:00:09,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:00:14,117 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=603840.0, ans=0.2 2023-09-30 05:00:18,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:00:19,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:00:19,318 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=603840.0, ans=0.125 2023-09-30 05:00:20,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:00:20,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:22,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:00:22,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:00:22,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:23,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=603840.0, ans=0.0 2023-09-30 05:00:26,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:00:29,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 05:00:29,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:31,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:00:32,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:00:32,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:00:32,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:00:35,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:00:35,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:00:37,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:38,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:00:38,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:39,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=603906.6666666666, ans=0.05 2023-09-30 05:00:42,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:00:47,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:48,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=603973.3333333334, ans=0.04949747468305833 2023-09-30 05:00:50,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:56,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:58,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:01:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 05:01:04,803 INFO [train.py:1039] (1/4) Epoch 18, batch 300, loss[loss=0.1725, simple_loss=0.2376, pruned_loss=0.05367, over 23845.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2581, pruned_loss=0.0541, over 3669462.77 frames. ], batch size: 179, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:01:04,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:04,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:01:06,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=604040.0, ans=0.1 2023-09-30 05:01:07,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 05:01:08,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:01:09,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:01:09,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 05:01:14,086 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.412e+02 1.790e+02 1.961e+02 2.231e+02 3.675e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 05:01:14,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:01:15,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:01:18,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:01:20,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 05:01:20,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:01:20,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=604106.6666666666, ans=0.125 2023-09-30 05:01:23,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:01:23,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 05:01:23,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:27,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:01:34,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:01:34,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 05:01:38,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 05:01:38,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:39,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:42,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:42,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 05:01:42,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:01:45,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:01:46,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:01:47,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.27 vs. limit=6.0 2023-09-30 05:01:47,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:49,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=604173.3333333334, ans=0.0 2023-09-30 05:01:49,949 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.88 vs. limit=12.0 2023-09-30 05:01:50,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:01:50,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 05:01:52,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:01:55,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:58,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 05:01:59,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:01:59,936 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:02:03,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:06,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:02:06,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 05:02:10,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:11,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:02:13,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:14,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:02:16,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 05:02:16,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:02:16,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:16,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 05:02:19,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:20,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:22,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:22,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:23,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:25,337 INFO [train.py:1039] (1/4) Epoch 18, batch 350, loss[loss=0.1905, simple_loss=0.2707, pruned_loss=0.05511, over 23975.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2566, pruned_loss=0.05339, over 3899286.13 frames. ], batch size: 80, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:02:25,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=604373.3333333334, ans=0.0 2023-09-30 05:02:28,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:28,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:02:30,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:32,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.85 vs. limit=15.0 2023-09-30 05:02:35,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:39,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:40,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:42,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 05:02:44,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:44,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=604440.0, ans=0.0 2023-09-30 05:02:46,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 05:02:47,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:47,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 05:02:49,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:51,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 05:02:52,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:02:54,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:55,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:57,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:02:57,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:57,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:02:59,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.07 vs. limit=15.0 2023-09-30 05:02:59,678 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.76 vs. limit=12.0 2023-09-30 05:03:00,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:00,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:09,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:09,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:03:10,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:03:10,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:16,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 05:03:16,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:22,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:22,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:22,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:03:24,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 05:03:27,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:27,257 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 05:03:30,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 05:03:30,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:33,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:03:33,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 05:03:36,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:38,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=604640.0, ans=0.125 2023-09-30 05:03:39,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:03:41,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:41,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=604640.0, ans=0.0 2023-09-30 05:03:43,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:43,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:46,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:47,927 INFO [train.py:1039] (1/4) Epoch 18, batch 400, loss[loss=0.1853, simple_loss=0.2636, pruned_loss=0.05352, over 23474.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.256, pruned_loss=0.05233, over 4091859.68 frames. ], batch size: 134, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:03:49,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:53,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:03:53,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 05:03:53,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:55,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:55,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:03:57,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:03:58,494 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.749e+02 1.897e+02 2.088e+02 3.470e+02, threshold=3.794e+02, percent-clipped=0.0 2023-09-30 05:04:00,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:01,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:03,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 05:04:04,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 05:04:04,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:06,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 05:04:06,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:11,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:04:11,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:12,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 05:04:12,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:04:12,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:12,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:15,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:04:17,327 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 05:04:18,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 05:04:23,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:23,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:26,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 05:04:27,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 05:04:30,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:04:32,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:36,279 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.90 vs. limit=12.0 2023-09-30 05:04:38,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 05:04:40,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:04:41,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 05:04:43,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:46,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:04:46,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 05:04:50,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:04:53,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:04:55,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:57,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:59,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 05:05:02,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:05:02,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 05:05:05,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:05:05,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:05:07,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 05:05:10,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:05:10,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:05:10,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:05:11,907 INFO [train.py:1039] (1/4) Epoch 18, batch 450, loss[loss=0.1743, simple_loss=0.2603, pruned_loss=0.04412, over 24358.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2559, pruned_loss=0.05247, over 4238967.75 frames. ], batch size: 77, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:05:12,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 05:05:13,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:05:13,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:05:13,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:05:13,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 05:05:15,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:05:16,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:05:18,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:05:28,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:29,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:05:31,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 05:05:33,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 05:05:37,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:05:40,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:40,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:43,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:44,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:46,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 05:05:48,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 05:05:48,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 05:05:49,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:05:49,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:49,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:05:51,549 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 05:05:51,563 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 05:05:53,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:54,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:05:56,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:06:01,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:06:01,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:06:02,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:06:04,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 05:06:05,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:08,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:06:08,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:06:11,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 05:06:16,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:06:16,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 05:06:18,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 05:06:18,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:20,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.47 vs. limit=22.5 2023-09-30 05:06:21,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=605306.6666666666, ans=0.125 2023-09-30 05:06:24,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:06:26,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:28,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:06:29,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 05:06:33,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:34,652 INFO [train.py:1039] (1/4) Epoch 18, batch 500, loss[loss=0.1777, simple_loss=0.2627, pruned_loss=0.04631, over 24330.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2571, pruned_loss=0.0534, over 4339706.68 frames. ], batch size: 74, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:06:34,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:06:36,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:36,387 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 05:06:38,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 05:06:38,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:40,940 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=605373.3333333334, ans=0.125 2023-09-30 05:06:44,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:06:46,375 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=605373.3333333334, ans=0.125 2023-09-30 05:06:47,416 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.876e+02 2.083e+02 2.292e+02 4.355e+02, threshold=4.166e+02, percent-clipped=1.0 2023-09-30 05:06:48,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:06:50,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:06:53,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:53,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:55,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:03,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:07:03,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:07:03,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 05:07:04,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:07:07,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:07:07,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:07:09,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:07:09,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:11,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 05:07:14,997 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 05:07:18,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:20,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:07:24,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 05:07:26,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=605573.3333333334, ans=0.0 2023-09-30 05:07:27,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:07:28,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:32,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:07:34,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=605573.3333333334, ans=0.125 2023-09-30 05:07:35,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:42,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:46,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 05:07:46,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:46,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:50,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 05:07:52,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:07:55,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:58,188 INFO [train.py:1039] (1/4) Epoch 18, batch 550, loss[loss=0.175, simple_loss=0.2507, pruned_loss=0.0496, over 24441.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2579, pruned_loss=0.05388, over 4413156.88 frames. ], batch size: 63, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:08:01,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 05:08:02,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 05:08:02,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:02,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 05:08:02,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:08:02,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:04,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:08:04,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=605706.6666666666, ans=0.125 2023-09-30 05:08:06,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:08:09,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:08:09,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 05:08:09,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:08:13,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:13,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:17,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:18,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:20,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.13 vs. limit=15.0 2023-09-30 05:08:27,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 05:08:28,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 05:08:29,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=605840.0, ans=0.2 2023-09-30 05:08:30,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:08:34,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=605840.0, ans=0.0 2023-09-30 05:08:36,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:08:37,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:38,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:08:41,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:41,556 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 05:08:41,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:43,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:08:46,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:46,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:08:46,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:08:49,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:50,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 05:08:52,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 05:08:52,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:08:52,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:54,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:08:54,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:57,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:09:00,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:09:02,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:09:04,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:06,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 05:09:06,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:09:08,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:09:08,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:09:10,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:11,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:09:13,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:09:16,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=605973.3333333334, ans=0.2 2023-09-30 05:09:18,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 05:09:18,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=606040.0, ans=0.0 2023-09-30 05:09:19,463 INFO [train.py:1039] (1/4) Epoch 18, batch 600, loss[loss=0.1912, simple_loss=0.2607, pruned_loss=0.06086, over 23770.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2579, pruned_loss=0.05402, over 4479509.08 frames. ], batch size: 149, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:09:22,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 05:09:24,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:09:24,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:09:24,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:09:30,822 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.862e+02 2.137e+02 2.512e+02 3.782e+02, threshold=4.274e+02, percent-clipped=0.0 2023-09-30 05:09:31,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:09:34,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:09:34,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 05:09:36,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:09:39,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:09:41,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:43,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 05:09:43,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:09:49,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 05:09:54,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:09:54,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:54,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:10:02,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:10:02,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:10:02,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:11,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:10:12,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:12,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:10:12,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:10:21,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 05:10:25,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:10:25,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:10:27,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=606306.6666666666, ans=0.125 2023-09-30 05:10:30,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 05:10:32,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:10:35,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 05:10:36,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:10:36,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:10:42,050 INFO [train.py:1039] (1/4) Epoch 18, batch 650, loss[loss=0.2006, simple_loss=0.2768, pruned_loss=0.06216, over 23929.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2566, pruned_loss=0.05426, over 4513170.19 frames. ], batch size: 86, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:10:43,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:10:45,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:10:47,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:10:48,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:10:51,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:10:54,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 05:10:55,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:59,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=606440.0, ans=0.0 2023-09-30 05:11:00,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:11:00,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:03,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:03,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=606440.0, ans=0.125 2023-09-30 05:11:04,275 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.15 vs. limit=15.0 2023-09-30 05:11:10,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 05:11:12,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:12,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:15,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:15,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:11:18,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:18,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:18,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:11:20,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:22,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:11:24,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:11:25,671 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 05:11:25,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:25,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:28,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:28,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:29,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:30,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:11:30,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 05:11:32,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:11:32,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:11:33,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:11:33,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:36,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:11:38,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 05:11:40,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 05:11:40,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:40,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:40,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:11:42,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:43,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:46,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=606573.3333333334, ans=0.09899494936611666 2023-09-30 05:11:49,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:50,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:50,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:54,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:54,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:11:54,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:12:01,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=606640.0, ans=0.125 2023-09-30 05:12:04,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:12:04,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:04,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:04,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:05,548 INFO [train.py:1039] (1/4) Epoch 18, batch 700, loss[loss=0.1783, simple_loss=0.2571, pruned_loss=0.04976, over 24000.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2557, pruned_loss=0.05383, over 4564613.11 frames. ], batch size: 80, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:12:10,255 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 05:12:11,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 05:12:14,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 05:12:16,480 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.793e+02 1.992e+02 2.237e+02 3.434e+02, threshold=3.985e+02, percent-clipped=0.0 2023-09-30 05:12:16,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:16,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:12:20,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 05:12:20,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=606773.3333333334, ans=0.125 2023-09-30 05:12:25,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:26,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:12:29,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.90 vs. limit=6.0 2023-09-30 05:12:30,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:30,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:12:32,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:12:35,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:37,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:12:37,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:12:38,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 05:12:41,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 05:12:46,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:12:46,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:12:47,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:12:48,425 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.00 vs. limit=6.0 2023-09-30 05:12:53,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:12:53,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 05:12:59,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:59,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:13:01,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 05:13:02,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:13:03,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=606906.6666666666, ans=0.125 2023-09-30 05:13:04,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:05,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=606906.6666666666, ans=0.125 2023-09-30 05:13:08,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:14,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:13:14,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 05:13:15,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 05:13:15,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 05:13:20,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:22,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:23,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:13:26,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:26,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 05:13:27,523 INFO [train.py:1039] (1/4) Epoch 18, batch 750, loss[loss=0.1989, simple_loss=0.2739, pruned_loss=0.06192, over 23630.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2545, pruned_loss=0.05325, over 4598503.97 frames. ], batch size: 94, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:13:30,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 05:13:30,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 05:13:32,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 05:13:32,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 05:13:33,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 05:13:34,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:13:35,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 05:13:36,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:36,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:13:39,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:42,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:42,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:13:42,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:46,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:13:48,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:13:49,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:13:52,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:52,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:54,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 05:13:55,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:13:55,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:57,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:14:01,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:14:01,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 05:14:02,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:04,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 05:14:04,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 05:14:04,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 05:14:04,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:14:06,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:14:07,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:14:13,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:14:13,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:15,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:14:18,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:14:19,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:19,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 05:14:19,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:14:20,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=607240.0, ans=0.2 2023-09-30 05:14:21,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 05:14:22,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:14:23,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=607240.0, ans=0.125 2023-09-30 05:14:25,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:14:27,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 05:14:27,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:34,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:14:34,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:14:35,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:14:38,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:14:42,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 05:14:42,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:42,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:49,875 INFO [train.py:1039] (1/4) Epoch 18, batch 800, loss[loss=0.1916, simple_loss=0.2709, pruned_loss=0.05622, over 24016.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.255, pruned_loss=0.05307, over 4635607.58 frames. ], batch size: 80, lr: 5.79e-03, grad_scale: 32.0 2023-09-30 05:14:50,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:50,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:14:57,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:57,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:59,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:59,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:00,863 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.976e+02 2.273e+02 2.818e+02 4.949e+02, threshold=4.546e+02, percent-clipped=4.0 2023-09-30 05:15:01,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:03,316 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.83 vs. limit=10.0 2023-09-30 05:15:06,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:07,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:15:09,695 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.80 vs. limit=15.0 2023-09-30 05:15:10,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 05:15:11,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:13,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:15:13,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:15:13,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:14,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 05:15:14,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:14,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 05:15:14,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=607440.0, ans=0.125 2023-09-30 05:15:19,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:22,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:22,913 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=607506.6666666666, ans=0.125 2023-09-30 05:15:24,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:15:24,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:27,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:28,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:31,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:15:32,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:15:33,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 05:15:34,992 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 05:15:35,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 05:15:35,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:15:35,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:15:35,920 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.30 vs. limit=6.0 2023-09-30 05:15:37,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:38,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:15:43,215 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 05:15:43,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 05:15:44,129 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=607573.3333333334, ans=0.0 2023-09-30 05:15:46,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:15:48,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:15:52,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:15:56,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:58,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 05:15:58,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:16:03,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 05:16:10,362 INFO [train.py:1039] (1/4) Epoch 18, batch 850, loss[loss=0.1862, simple_loss=0.2528, pruned_loss=0.05983, over 23831.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2562, pruned_loss=0.05366, over 4642792.49 frames. ], batch size: 195, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:16:10,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:13,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:16:13,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 05:16:13,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:16:15,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:17,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 05:16:18,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:18,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:16:19,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=607706.6666666666, ans=0.125 2023-09-30 05:16:20,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:21,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:16:22,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:16:24,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 05:16:24,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 05:16:24,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 05:16:27,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:27,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:16:29,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:29,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:30,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:16:36,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:36,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:16:36,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 05:16:40,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 05:16:43,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:45,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 05:16:47,274 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-09-30 05:16:50,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 05:16:50,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 05:16:54,206 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 05:16:54,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:16:54,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:16:54,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:16:55,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:57,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:58,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 05:17:00,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:17:01,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:03,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:17:04,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:17:05,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:17:06,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=607906.6666666666, ans=0.0 2023-09-30 05:17:07,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:17:08,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 05:17:13,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:17:13,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:15,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:17:15,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:15,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:18,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:17:21,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:17:23,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:17:23,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:23,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:17:27,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=607973.3333333334, ans=0.0 2023-09-30 05:17:33,408 INFO [train.py:1039] (1/4) Epoch 18, batch 900, loss[loss=0.2039, simple_loss=0.2622, pruned_loss=0.07283, over 23775.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2566, pruned_loss=0.05368, over 4660812.23 frames. ], batch size: 150, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:17:33,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:17:34,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:36,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 05:17:36,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:36,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:38,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 05:17:46,196 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.851e+02 2.083e+02 2.531e+02 4.017e+02, threshold=4.166e+02, percent-clipped=0.0 2023-09-30 05:17:46,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:17:47,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:49,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 05:17:50,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=608106.6666666666, ans=0.1 2023-09-30 05:17:52,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:17:52,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 05:17:53,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:17:54,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:54,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:17:56,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:17:56,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:17:59,059 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.97 vs. limit=15.0 2023-09-30 05:18:01,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=608106.6666666666, ans=0.95 2023-09-30 05:18:01,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=608106.6666666666, ans=0.125 2023-09-30 05:18:03,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=608106.6666666666, ans=0.125 2023-09-30 05:18:07,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:08,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:18:08,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:18:11,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:16,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 05:18:16,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:18:19,050 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.10 vs. limit=6.0 2023-09-30 05:18:20,551 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.33 vs. limit=15.0 2023-09-30 05:18:21,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:18:22,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:18:24,620 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 05:18:26,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 05:18:26,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=608240.0, ans=0.125 2023-09-30 05:18:30,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:18:30,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:18:33,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:18:39,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:39,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:18:41,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 05:18:41,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:43,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 05:18:46,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:18:46,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:48,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:18:49,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:18:54,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 05:18:54,390 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 05:18:56,283 INFO [train.py:1039] (1/4) Epoch 18, batch 950, loss[loss=0.1726, simple_loss=0.2547, pruned_loss=0.04524, over 24470.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2569, pruned_loss=0.05354, over 4677987.32 frames. ], batch size: 63, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:18:58,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:18:58,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 05:18:59,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:02,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 05:19:08,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:11,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:11,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:12,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:19:15,070 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 05:19:20,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:20,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:21,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:21,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:19:21,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 05:19:23,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:19:25,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:26,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 05:19:26,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:27,345 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.05 vs. limit=15.0 2023-09-30 05:19:33,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:33,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:33,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:34,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 05:19:36,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 05:19:38,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:39,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:19:44,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:19:44,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:45,243 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=608573.3333333334, ans=0.125 2023-09-30 05:19:48,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 05:19:51,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:19:51,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:19:51,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:19:51,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:51,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:19:56,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 05:19:58,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:20:01,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:01,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:01,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 05:20:01,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:01,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:20:03,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 05:20:08,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:20:08,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:15,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:15,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 05:20:15,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 05:20:17,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=608640.0, ans=0.0 2023-09-30 05:20:19,773 INFO [train.py:1039] (1/4) Epoch 18, batch 1000, loss[loss=0.1801, simple_loss=0.2254, pruned_loss=0.06742, over 19504.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2565, pruned_loss=0.05393, over 4669044.79 frames. ], batch size: 388, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:20:19,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:25,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 05:20:25,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:20:30,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:20:33,043 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.036e+02 2.242e+02 3.072e+02 5.531e+02, threshold=4.484e+02, percent-clipped=11.0 2023-09-30 05:20:33,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 05:20:33,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 05:20:36,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:36,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:38,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:43,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 05:20:46,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 05:20:47,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 05:20:48,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:20:50,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 05:20:50,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=608773.3333333334, ans=0.2 2023-09-30 05:20:51,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 05:20:51,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 05:20:53,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:55,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:20:55,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=608840.0, ans=0.0 2023-09-30 05:21:03,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:03,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:21:04,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=608840.0, ans=10.0 2023-09-30 05:21:05,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:06,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:06,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 05:21:06,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:06,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:21:08,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:08,490 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 05:21:13,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 05:21:13,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 05:21:15,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 05:21:15,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=608906.6666666666, ans=0.5 2023-09-30 05:21:17,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:21:20,875 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:21:25,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:26,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:21:26,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:28,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:21:30,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 05:21:31,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:21:33,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 05:21:33,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 05:21:36,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:21:36,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:38,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:21:41,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:21:42,874 INFO [train.py:1039] (1/4) Epoch 18, batch 1050, loss[loss=0.1814, simple_loss=0.2562, pruned_loss=0.05334, over 23199.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2545, pruned_loss=0.05312, over 4679394.83 frames. ], batch size: 105, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:21:43,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:47,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:21:47,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:21:48,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=609040.0, ans=0.1 2023-09-30 05:21:49,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:21:51,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:53,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:21:56,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:21:56,927 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=609040.0, ans=0.125 2023-09-30 05:21:58,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:22:01,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:22:01,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:22:01,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:22:02,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:22:02,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 05:22:05,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:06,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 05:22:09,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:22:09,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 05:22:09,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:22:09,938 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=609106.6666666666, ans=0.2 2023-09-30 05:22:16,502 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=609173.3333333334, ans=0.125 2023-09-30 05:22:17,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:22:17,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:22:17,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:20,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 05:22:21,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 05:22:21,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:22:26,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 05:22:29,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 05:22:31,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:22:34,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:22:35,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=609240.0, ans=0.125 2023-09-30 05:22:36,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:22:36,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:22:37,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:22:40,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:22:40,930 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.80 vs. limit=12.0 2023-09-30 05:22:44,263 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.37 vs. limit=15.0 2023-09-30 05:22:45,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 05:22:45,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=609240.0, ans=0.125 2023-09-30 05:22:45,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=609240.0, ans=10.0 2023-09-30 05:22:46,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 05:22:47,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 05:22:47,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:48,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:22:50,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 05:22:53,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:22:54,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:54,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:22:56,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:22:56,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:22:59,053 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=15.0 2023-09-30 05:23:01,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 05:23:02,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:23:02,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 05:23:04,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 05:23:05,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:23:06,945 INFO [train.py:1039] (1/4) Epoch 18, batch 1100, loss[loss=0.1954, simple_loss=0.2729, pruned_loss=0.05892, over 23358.00 frames. ], tot_loss[loss=0.18, simple_loss=0.254, pruned_loss=0.05299, over 4667633.77 frames. ], batch size: 93, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:23:07,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=609373.3333333334, ans=0.5 2023-09-30 05:23:08,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:14,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:23:20,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:23:22,016 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.922e+02 2.115e+02 2.641e+02 4.048e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 05:23:22,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:23:22,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:22,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 05:23:25,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:23:26,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:23:28,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:23:31,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:23:31,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 05:23:33,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:23:35,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:35,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:23:37,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:23:40,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:23:40,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=609506.6666666666, ans=0.1 2023-09-30 05:23:45,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:23:46,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=609506.6666666666, ans=0.125 2023-09-30 05:23:47,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 05:23:49,241 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 05:23:49,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:23:54,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:54,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 05:23:56,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:23:56,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:23:56,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:23:57,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:57,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 05:24:05,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:24:05,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 05:24:06,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:24:12,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:24:15,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 05:24:15,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:24:17,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:24:19,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=609640.0, ans=0.125 2023-09-30 05:24:20,210 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.90 vs. limit=15.0 2023-09-30 05:24:20,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:22,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:22,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 05:24:23,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:24:23,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:25,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 05:24:25,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:24:26,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 05:24:27,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:24:27,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:24:28,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:24:30,531 INFO [train.py:1039] (1/4) Epoch 18, batch 1150, loss[loss=0.1803, simple_loss=0.253, pruned_loss=0.05383, over 23079.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2544, pruned_loss=0.05323, over 4674722.62 frames. ], batch size: 105, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:24:33,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:35,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:24:38,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:38,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:24:38,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 05:24:38,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:24:42,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 05:24:42,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=609706.6666666666, ans=0.0 2023-09-30 05:24:43,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:43,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:24:49,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 05:24:52,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:24:56,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:56,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:24:56,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 05:24:57,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:24:57,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:25:02,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 05:25:02,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:04,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:25:15,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:16,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.69 vs. limit=22.5 2023-09-30 05:25:23,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:23,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 05:25:23,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:24,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:30,978 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 05:25:33,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:40,789 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 05:25:45,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:25:46,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:25:46,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:25:48,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:25:50,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:25:53,400 INFO [train.py:1039] (1/4) Epoch 18, batch 1200, loss[loss=0.1751, simple_loss=0.2372, pruned_loss=0.05653, over 23791.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2546, pruned_loss=0.05282, over 4688715.85 frames. ], batch size: 150, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:25:57,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:25:57,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:25:58,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:58,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:00,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:26:00,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:26:03,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:26:06,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:06,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:08,277 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.757e+02 1.911e+02 2.190e+02 3.350e+02, threshold=3.822e+02, percent-clipped=0.0 2023-09-30 05:26:08,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=610106.6666666666, ans=0.1 2023-09-30 05:26:10,037 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 05:26:11,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 05:26:12,495 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.83 vs. limit=10.0 2023-09-30 05:26:16,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=610106.6666666666, ans=0.125 2023-09-30 05:26:18,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:26:19,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:26:22,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:26:24,459 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 05:26:26,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:30,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=610173.3333333334, ans=0.0 2023-09-30 05:26:34,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:26:34,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:26:34,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 05:26:36,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:26:39,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 05:26:44,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 05:26:44,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:46,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:47,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:26:47,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:26:48,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:49,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:26:51,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:26:51,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 05:26:51,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:26:52,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:26:52,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:26:54,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:55,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:27:00,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:27:02,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:27:06,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 05:27:08,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=610306.6666666666, ans=0.125 2023-09-30 05:27:11,423 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 05:27:14,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:15,617 INFO [train.py:1039] (1/4) Epoch 18, batch 1250, loss[loss=0.187, simple_loss=0.2526, pruned_loss=0.06069, over 23559.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2558, pruned_loss=0.05267, over 4711986.97 frames. ], batch size: 134, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:27:17,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:27:18,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:27:21,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:27:24,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 05:27:27,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:27:29,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:29,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 05:27:31,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:27:34,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:27:36,075 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=610440.0, ans=0.0 2023-09-30 05:27:37,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:27:37,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:40,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:27:40,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:42,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:27:47,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:27:47,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:27:47,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:49,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:49,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:27:50,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:27:54,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:27:59,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 05:27:59,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=610506.6666666666, ans=0.0 2023-09-30 05:28:00,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:28:02,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:03,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 05:28:05,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:28:05,573 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 05:28:07,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:07,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:10,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:13,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:13,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:28:15,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 05:28:15,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 05:28:15,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 05:28:15,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=610573.3333333334, ans=0.125 2023-09-30 05:28:18,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:19,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=610640.0, ans=0.2 2023-09-30 05:28:21,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 05:28:21,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:22,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=610640.0, ans=0.125 2023-09-30 05:28:24,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:28:25,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:28:26,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 05:28:27,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:28:28,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:28:28,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:28:28,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:30,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 05:28:34,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:35,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:28:35,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:28:38,670 INFO [train.py:1039] (1/4) Epoch 18, batch 1300, loss[loss=0.1706, simple_loss=0.248, pruned_loss=0.04657, over 24310.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2572, pruned_loss=0.05362, over 4699595.90 frames. ], batch size: 61, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:28:38,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:28:41,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:41,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 05:28:45,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:48,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:28:48,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:28:52,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:52,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:28:53,670 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.896e+02 2.139e+02 2.447e+02 3.795e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-30 05:28:53,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 05:28:59,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:28:59,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:29:00,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 05:29:05,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:29:10,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:10,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:12,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:29:13,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:15,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:29:16,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:29:17,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 05:29:18,278 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:29:24,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:29:24,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:29:26,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 05:29:26,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:29:27,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:29:30,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:29:32,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 05:29:33,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:33,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 05:29:35,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:35,789 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=610906.6666666666, ans=0.0 2023-09-30 05:29:40,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:40,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:29:43,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 05:29:45,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 05:29:45,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 05:29:49,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:29:53,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 05:29:54,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:56,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=610973.3333333334, ans=0.0 2023-09-30 05:30:01,235 INFO [train.py:1039] (1/4) Epoch 18, batch 1350, loss[loss=0.1866, simple_loss=0.2497, pruned_loss=0.06177, over 23828.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2567, pruned_loss=0.05384, over 4696526.43 frames. ], batch size: 179, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:30:01,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 05:30:07,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:08,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:14,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:30:14,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:16,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:30:18,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:21,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:23,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 05:30:24,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:30:24,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:30:27,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 05:30:29,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:30:31,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:30:31,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 05:30:32,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 05:30:34,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 05:30:35,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:35,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 05:30:49,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:51,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=611240.0, ans=0.0 2023-09-30 05:30:59,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:59,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:30:59,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 05:31:02,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:02,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=611240.0, ans=0.0 2023-09-30 05:31:03,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 05:31:03,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:31:03,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:31:07,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:31:09,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 05:31:10,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=611306.6666666666, ans=0.0 2023-09-30 05:31:11,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:31:16,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=611306.6666666666, ans=0.125 2023-09-30 05:31:17,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 05:31:21,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 05:31:24,492 INFO [train.py:1039] (1/4) Epoch 18, batch 1400, loss[loss=0.1806, simple_loss=0.2605, pruned_loss=0.05029, over 23998.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2549, pruned_loss=0.0533, over 4694258.12 frames. ], batch size: 80, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:31:24,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=611373.3333333334, ans=0.0 2023-09-30 05:31:26,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 05:31:26,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:28,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=611373.3333333334, ans=0.2 2023-09-30 05:31:29,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:31:31,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:31:35,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 05:31:37,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 05:31:38,681 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.909e+02 2.025e+02 2.334e+02 3.516e+02, threshold=4.051e+02, percent-clipped=0.0 2023-09-30 05:31:48,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:31:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:31:54,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:31:54,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:31:59,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:32:00,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:32:08,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:12,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 05:32:14,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:32:15,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:32:16,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:32:16,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:20,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:32:20,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:32:21,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:32:21,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 05:32:21,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:32:27,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:31,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:32:39,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 05:32:40,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:32:40,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:32:43,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:32:44,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:45,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:32:46,987 INFO [train.py:1039] (1/4) Epoch 18, batch 1450, loss[loss=0.1853, simple_loss=0.2695, pruned_loss=0.0505, over 24544.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2534, pruned_loss=0.05313, over 4691654.56 frames. ], batch size: 71, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:32:49,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:32:52,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:32:52,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:52,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:32:58,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:58,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:32:58,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=611706.6666666666, ans=0.125 2023-09-30 05:32:59,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:59,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 05:33:01,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:33:03,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 05:33:04,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:05,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:05,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 05:33:06,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:07,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:33:08,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 05:33:08,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:10,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:33:12,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:14,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:15,738 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=611773.3333333334, ans=0.125 2023-09-30 05:33:16,409 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.70 vs. limit=15.0 2023-09-30 05:33:18,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:33:18,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:33:21,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:33:21,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:24,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:33:24,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:30,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 05:33:32,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:36,444 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 05:33:38,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:40,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:33:40,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:40,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=611906.6666666666, ans=0.0 2023-09-30 05:33:41,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 05:33:46,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:47,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 05:33:49,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 05:33:50,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:53,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:33:53,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:55,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 05:33:59,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 05:34:00,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 05:34:00,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=611973.3333333334, ans=0.035 2023-09-30 05:34:01,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:05,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:34:09,825 INFO [train.py:1039] (1/4) Epoch 18, batch 1500, loss[loss=0.1733, simple_loss=0.2481, pruned_loss=0.04931, over 23519.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2542, pruned_loss=0.05299, over 4699816.38 frames. ], batch size: 106, lr: 5.77e-03, grad_scale: 8.0 2023-09-30 05:34:15,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 05:34:15,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:34:15,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:34:15,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:16,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:18,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:34:18,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 05:34:20,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:34:21,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:34:21,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:22,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:34:24,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:34:24,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:25,932 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.869e+02 2.047e+02 2.380e+02 3.622e+02, threshold=4.094e+02, percent-clipped=0.0 2023-09-30 05:34:32,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:32,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 05:34:34,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:34:34,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:34:35,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:41,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 05:34:46,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 05:34:46,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:46,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=612173.3333333334, ans=0.2 2023-09-30 05:34:47,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 05:34:49,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:34:51,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:34:53,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:53,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:34:54,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 05:34:54,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:34:55,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:34:55,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 05:34:56,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:34:59,826 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.14 vs. limit=15.0 2023-09-30 05:35:03,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:35:03,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 05:35:08,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:35:10,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=612240.0, ans=0.125 2023-09-30 05:35:11,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:35:16,372 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 05:35:16,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:16,463 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 05:35:18,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:18,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:35:19,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 05:35:21,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:35:24,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 05:35:26,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:27,093 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.89 vs. limit=12.0 2023-09-30 05:35:30,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:30,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:31,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:32,340 INFO [train.py:1039] (1/4) Epoch 18, batch 1550, loss[loss=0.1912, simple_loss=0.2672, pruned_loss=0.0576, over 23643.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2549, pruned_loss=0.05267, over 4722773.86 frames. ], batch size: 85, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:35:32,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:35:32,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 05:35:34,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 05:35:34,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:35:34,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=612373.3333333334, ans=0.1 2023-09-30 05:35:35,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 05:35:35,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 05:35:36,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=612373.3333333334, ans=0.125 2023-09-30 05:35:38,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:40,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:41,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:35:41,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:35:43,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:45,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:46,801 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 05:35:47,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:47,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:35:49,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:35:52,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:35:52,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 05:35:52,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:53,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 05:35:53,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 05:35:53,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 05:35:55,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:57,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:35:59,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.85 vs. limit=15.0 2023-09-30 05:36:00,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:36:03,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 05:36:04,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 05:36:05,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=612506.6666666666, ans=0.2 2023-09-30 05:36:08,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=612506.6666666666, ans=0.125 2023-09-30 05:36:10,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.81 vs. limit=15.0 2023-09-30 05:36:11,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=612506.6666666666, ans=0.125 2023-09-30 05:36:12,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:14,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=612506.6666666666, ans=0.125 2023-09-30 05:36:15,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:36:15,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:36:15,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:36:17,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 05:36:21,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:36:24,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:27,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:36:30,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:36:30,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:30,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 05:36:30,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:33,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:36:33,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:34,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=612573.3333333334, ans=0.0 2023-09-30 05:36:35,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:36:35,138 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 05:36:36,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:36:44,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 05:36:47,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:49,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:51,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 05:36:52,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:53,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=612706.6666666666, ans=0.1 2023-09-30 05:36:54,072 INFO [train.py:1039] (1/4) Epoch 18, batch 1600, loss[loss=0.1881, simple_loss=0.2759, pruned_loss=0.05016, over 24036.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2556, pruned_loss=0.0533, over 4707536.10 frames. ], batch size: 86, lr: 5.76e-03, grad_scale: 16.0 2023-09-30 05:36:54,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:54,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=612706.6666666666, ans=0.2 2023-09-30 05:36:56,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:36:56,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:36:57,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:37:01,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:01,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 05:37:03,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 05:37:05,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 05:37:07,399 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.40 vs. limit=15.0 2023-09-30 05:37:08,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:09,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 05:37:11,283 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.776e+02 1.986e+02 2.218e+02 2.701e+02, threshold=3.972e+02, percent-clipped=0.0 2023-09-30 05:37:11,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:37:13,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:37:13,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=612773.3333333334, ans=0.125 2023-09-30 05:37:16,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:37:20,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 05:37:22,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:37:22,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 05:37:22,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:24,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 05:37:29,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 05:37:30,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=612840.0, ans=0.125 2023-09-30 05:37:31,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=612840.0, ans=0.1 2023-09-30 05:37:33,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=612840.0, ans=0.125 2023-09-30 05:37:35,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=612840.0, ans=0.04949747468305833 2023-09-30 05:37:36,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:36,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 05:37:38,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:38,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:38,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:37:41,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 05:37:41,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=612840.0, ans=0.09899494936611666 2023-09-30 05:37:46,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 05:37:49,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:37:50,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:50,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:52,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:37:52,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=612906.6666666666, ans=0.0 2023-09-30 05:37:53,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:37:55,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:37:57,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:38:06,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:06,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:38:10,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 05:38:10,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:38:12,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 05:38:16,813 INFO [train.py:1039] (1/4) Epoch 18, batch 1650, loss[loss=0.1708, simple_loss=0.2425, pruned_loss=0.0496, over 18349.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2569, pruned_loss=0.05411, over 4695400.18 frames. ], batch size: 40, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:38:17,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:19,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:38:21,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:38:21,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 05:38:21,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 05:38:21,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 05:38:21,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 05:38:26,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:26,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:27,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:38:27,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:38:29,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:32,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 05:38:34,823 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=613106.6666666666, ans=0.0 2023-09-30 05:38:35,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:38:35,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:35,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:38:35,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:38:37,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 05:38:38,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 05:38:44,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:38:45,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=613106.6666666666, ans=0.95 2023-09-30 05:38:45,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=613106.6666666666, ans=0.0 2023-09-30 05:38:46,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:38:53,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=613173.3333333334, ans=0.125 2023-09-30 05:38:54,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 05:38:56,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:38:57,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 05:39:01,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=613173.3333333334, ans=0.125 2023-09-30 05:39:02,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:03,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:39:06,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:39:06,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:06,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:39:06,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:07,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:09,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:09,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:09,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:11,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:13,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:39:14,149 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.09 vs. limit=10.0 2023-09-30 05:39:15,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=613240.0, ans=0.5 2023-09-30 05:39:16,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:16,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=613240.0, ans=0.5 2023-09-30 05:39:18,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 05:39:19,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:20,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=613240.0, ans=0.0 2023-09-30 05:39:21,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 05:39:21,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 05:39:21,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 05:39:21,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:23,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:39:23,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:24,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:24,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 05:39:28,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:33,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:39:33,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:36,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=613306.6666666666, ans=0.2 2023-09-30 05:39:37,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 05:39:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:41,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:39:42,811 INFO [train.py:1039] (1/4) Epoch 18, batch 1700, loss[loss=0.2007, simple_loss=0.2852, pruned_loss=0.05812, over 24340.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2565, pruned_loss=0.05485, over 4682264.60 frames. ], batch size: 74, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:39:42,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 05:39:42,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:39:42,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:39:42,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:44,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:39:46,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:39:46,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 05:39:50,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:39:56,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=613373.3333333334, ans=0.125 2023-09-30 05:40:00,302 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.875e+02 2.086e+02 2.365e+02 3.763e+02, threshold=4.171e+02, percent-clipped=0.0 2023-09-30 05:40:00,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:02,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:40:09,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:40:10,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:10,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:40:11,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:14,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 05:40:15,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:40:16,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:17,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:40:18,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:40:20,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 05:40:20,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 05:40:23,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:25,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 05:40:26,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:40:31,217 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.16 vs. limit=22.5 2023-09-30 05:40:37,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:38,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:40,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:41,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:40:41,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 05:40:41,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:43,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:43,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 05:40:44,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:40:44,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:44,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:44,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:40:46,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:46,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:40:48,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:48,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=613640.0, ans=0.125 2023-09-30 05:40:48,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=613640.0, ans=0.125 2023-09-30 05:40:49,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:40:50,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:54,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:56,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 05:40:58,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:58,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:01,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 05:41:05,582 INFO [train.py:1039] (1/4) Epoch 18, batch 1750, loss[loss=0.1933, simple_loss=0.2722, pruned_loss=0.05719, over 23955.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2553, pruned_loss=0.0539, over 4684713.14 frames. ], batch size: 86, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:41:08,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:10,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.00 vs. limit=10.0 2023-09-30 05:41:11,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:11,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:41:11,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 05:41:13,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:41:14,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.96 vs. limit=15.0 2023-09-30 05:41:16,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:41:16,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:21,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 05:41:23,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:24,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 05:41:24,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:25,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=613773.3333333334, ans=0.125 2023-09-30 05:41:26,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:41:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:41:31,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 05:41:32,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:41:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 05:41:41,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:41:45,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:41:45,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:49,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:49,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:52,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:54,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:56,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:41:58,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:59,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 05:41:59,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:42:04,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 05:42:04,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:04,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=613906.6666666666, ans=0.125 2023-09-30 05:42:04,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=613906.6666666666, ans=0.125 2023-09-30 05:42:05,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:07,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:42:11,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:42:11,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:42:13,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:15,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:18,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:21,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:22,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=613973.3333333334, ans=0.09899494936611666 2023-09-30 05:42:23,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:42:23,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 05:42:23,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:24,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:42:24,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:24,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:42:25,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:42:25,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=613973.3333333334, ans=0.125 2023-09-30 05:42:27,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:42:28,514 INFO [train.py:1039] (1/4) Epoch 18, batch 1800, loss[loss=0.1501, simple_loss=0.2266, pruned_loss=0.03675, over 24580.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2539, pruned_loss=0.05331, over 4683402.65 frames. ], batch size: 60, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:42:30,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:42:31,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:33,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:42:33,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=614040.0, ans=0.125 2023-09-30 05:42:34,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.86 vs. limit=15.0 2023-09-30 05:42:36,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:39,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:42:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:42:45,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:42:46,628 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.840e+02 1.957e+02 2.222e+02 3.420e+02, threshold=3.915e+02, percent-clipped=0.0 2023-09-30 05:42:48,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:48,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:50,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:42:52,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:52,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 05:42:53,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=614106.6666666666, ans=0.125 2023-09-30 05:42:54,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:42:58,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:01,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 05:43:03,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 05:43:03,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 05:43:05,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:06,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:43:06,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:08,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:43:08,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=614173.3333333334, ans=0.0 2023-09-30 05:43:12,960 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 05:43:14,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:43:16,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:18,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 05:43:19,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 05:43:21,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:43:22,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:43:23,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:43:30,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 05:43:33,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=614306.6666666666, ans=0.5 2023-09-30 05:43:36,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=614306.6666666666, ans=0.125 2023-09-30 05:43:37,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:43:38,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 05:43:39,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:43:39,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:40,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:43:41,134 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=13.50 vs. limit=15.0 2023-09-30 05:43:41,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 05:43:44,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:43:44,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:43:46,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 05:43:46,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:49,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:49,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:43:49,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:50,785 INFO [train.py:1039] (1/4) Epoch 18, batch 1850, loss[loss=0.1706, simple_loss=0.2556, pruned_loss=0.04275, over 24483.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2544, pruned_loss=0.05295, over 4689961.34 frames. ], batch size: 63, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:43:52,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:52,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:43:52,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=614373.3333333334, ans=0.0 2023-09-30 05:43:54,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:55,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:58,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:43:58,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:00,357 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=6.336e-03 2023-09-30 05:44:06,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:44:07,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 05:44:09,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 05:44:14,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-30 05:44:14,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 05:44:17,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:17,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 05:44:17,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:44:25,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=614506.6666666666, ans=0.125 2023-09-30 05:44:27,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:44:30,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 05:44:31,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.55 vs. limit=15.0 2023-09-30 05:44:33,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:44:33,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:44:38,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 05:44:38,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:39,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:44:39,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:44:41,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:41,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=614573.3333333334, ans=0.125 2023-09-30 05:44:44,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:44:46,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:44:48,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:48,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:44:48,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:48,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=614573.3333333334, ans=0.2 2023-09-30 05:44:51,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:52,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:44:54,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=614573.3333333334, ans=0.2 2023-09-30 05:44:55,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 05:44:55,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:57,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=614640.0, ans=0.1 2023-09-30 05:44:58,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:44:58,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:44:58,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 05:44:58,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 05:45:02,524 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 05:45:04,284 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 05:45:04,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:45:04,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:45:06,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:06,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:07,998 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 05:45:08,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:45:08,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:08,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=614640.0, ans=0.0 2023-09-30 05:45:09,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:45:09,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:45:11,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:45:11,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 05:45:11,628 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=614640.0, ans=0.0 2023-09-30 05:45:14,127 INFO [train.py:1039] (1/4) Epoch 18, batch 1900, loss[loss=0.1964, simple_loss=0.2628, pruned_loss=0.06501, over 24000.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2558, pruned_loss=0.05371, over 4691023.19 frames. ], batch size: 196, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:45:14,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:14,316 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 05:45:14,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:45:15,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:20,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=614706.6666666666, ans=15.0 2023-09-30 05:45:21,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:24,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:45:24,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 05:45:24,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 05:45:25,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:27,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:45:27,260 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 05:45:28,805 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 05:45:30,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=614773.3333333334, ans=0.1 2023-09-30 05:45:31,884 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.898e+02 2.159e+02 2.520e+02 3.634e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-30 05:45:33,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 05:45:35,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:45:39,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 05:45:40,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 05:45:48,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 05:45:52,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 05:45:54,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:54,225 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 05:45:54,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 05:45:54,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 05:45:54,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 05:45:54,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:45:58,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 05:46:00,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=614840.0, ans=0.125 2023-09-30 05:46:01,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:46:07,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:07,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 05:46:07,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:46:11,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 05:46:11,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:19,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:46:19,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:46:19,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:46:19,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:46:19,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=614973.3333333334, ans=0.0 2023-09-30 05:46:21,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:46:22,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:46:22,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:46:25,582 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.96 vs. limit=15.0 2023-09-30 05:46:26,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:26,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:46:29,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:46:29,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:29,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:30,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:33,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:35,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=615040.0, ans=0.125 2023-09-30 05:46:36,671 INFO [train.py:1039] (1/4) Epoch 18, batch 1950, loss[loss=0.1822, simple_loss=0.255, pruned_loss=0.05472, over 23774.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2573, pruned_loss=0.05426, over 4686321.75 frames. ], batch size: 195, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:46:36,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:46:38,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:38,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:46:40,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 05:46:40,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:46:40,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:42,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:43,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=615040.0, ans=0.0 2023-09-30 05:46:45,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:46:45,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:46:45,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:48,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:46:49,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=615040.0, ans=0.125 2023-09-30 05:46:50,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:50,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:46:50,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:46:51,085 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=615040.0, ans=0.0 2023-09-30 05:46:52,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:54,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.68 vs. limit=15.0 2023-09-30 05:46:56,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:59,501 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.99 vs. limit=15.0 2023-09-30 05:47:00,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:47:00,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:00,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:47:00,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 05:47:01,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:47:01,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:47:03,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:08,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:09,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:47:13,613 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.29 vs. limit=22.5 2023-09-30 05:47:16,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:47:19,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:47:20,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:47:21,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 05:47:22,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:47:26,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:47:28,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:47:29,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:39,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:39,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:41,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:44,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:46,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:47:47,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:47,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 05:47:47,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:47:49,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:51,209 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 05:47:52,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:47:53,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=615306.6666666666, ans=0.1 2023-09-30 05:47:55,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.34 vs. limit=22.5 2023-09-30 05:47:57,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:58,912 INFO [train.py:1039] (1/4) Epoch 18, batch 2000, loss[loss=0.1791, simple_loss=0.2716, pruned_loss=0.04328, over 24309.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2573, pruned_loss=0.05372, over 4694745.38 frames. ], batch size: 74, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:47:59,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:48:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:02,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:48:04,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:09,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 05:48:09,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:48:12,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:48:16,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 05:48:16,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:48:17,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:48:19,241 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.900e+02 2.106e+02 2.415e+02 3.499e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 05:48:20,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:48:22,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 05:48:24,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:29,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 05:48:29,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:48:29,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=615440.0, ans=0.025 2023-09-30 05:48:30,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 05:48:30,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:34,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:48:36,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:48:36,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:37,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:39,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:48:39,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 05:48:43,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 05:48:43,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:43,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:48:46,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=615506.6666666666, ans=0.125 2023-09-30 05:48:49,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:51,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:48:51,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:51,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:53,689 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.83 vs. limit=22.5 2023-09-30 05:48:54,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:55,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:55,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:55,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:55,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:00,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:49:00,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 05:49:07,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:49:08,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:49:15,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:16,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:16,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:19,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:49:19,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:49:20,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:22,119 INFO [train.py:1039] (1/4) Epoch 18, batch 2050, loss[loss=0.1774, simple_loss=0.2531, pruned_loss=0.05088, over 23453.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.256, pruned_loss=0.05346, over 4694738.80 frames. ], batch size: 106, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:49:22,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:25,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:27,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:31,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:49:35,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:49:36,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:37,036 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.28 vs. limit=10.0 2023-09-30 05:49:38,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:49:40,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 05:49:40,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:49:40,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:49:41,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:49:51,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:49:52,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:53,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 05:49:53,476 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=615840.0, ans=0.2 2023-09-30 05:49:56,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:58,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 05:49:58,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:50:03,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:04,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:05,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:50:06,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:06,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:50:08,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:50:09,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:50:12,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.07 vs. limit=22.5 2023-09-30 05:50:14,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:16,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:50:18,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:50:18,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:50:23,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:23,676 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=615906.6666666666, ans=0.125 2023-09-30 05:50:28,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:50:30,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 05:50:35,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:37,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:50:40,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:50:43,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 05:50:43,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=616040.0, ans=0.125 2023-09-30 05:50:44,547 INFO [train.py:1039] (1/4) Epoch 18, batch 2100, loss[loss=0.1669, simple_loss=0.2528, pruned_loss=0.04054, over 24470.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2544, pruned_loss=0.05304, over 4697302.27 frames. ], batch size: 66, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:50:46,744 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 05:50:46,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:48,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:48,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:50:48,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=616040.0, ans=0.125 2023-09-30 05:50:49,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:49,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 05:50:51,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 05:50:53,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:56,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:50:57,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:50:59,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:59,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:00,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 05:51:01,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:51:01,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 05:51:01,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 05:51:03,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:03,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:03,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 05:51:04,703 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.913e+02 2.117e+02 2.530e+02 3.526e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 05:51:05,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 05:51:10,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 05:51:10,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:51:14,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:51:14,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:51:17,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:51:19,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 05:51:21,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:21,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:51:22,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 05:51:22,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:22,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 05:51:24,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 05:51:24,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 05:51:27,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:51:29,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:51:31,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:34,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:35,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:39,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 05:51:39,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:39,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:41,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 05:51:42,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 05:51:42,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 05:51:47,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:51:50,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:50,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 05:51:55,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:57,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:51:59,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:59,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:51:59,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:51:59,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:01,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:02,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:52:02,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:52:02,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:04,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 05:52:05,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 05:52:05,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:07,389 INFO [train.py:1039] (1/4) Epoch 18, batch 2150, loss[loss=0.1805, simple_loss=0.2606, pruned_loss=0.05017, over 24412.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2541, pruned_loss=0.05269, over 4710584.07 frames. ], batch size: 77, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:52:08,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:52:09,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:52:09,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:52:09,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:52:16,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:52:17,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:19,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:21,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:52:21,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:21,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:52:25,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:27,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:52:27,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:52:30,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:30,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 05:52:36,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:37,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:52:39,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:39,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:52:40,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:40,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:52:40,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:42,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 05:52:43,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:52:45,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:46,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:47,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:47,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:52:50,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:50,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:52:52,598 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=616506.6666666666, ans=0.07 2023-09-30 05:52:52,620 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=616506.6666666666, ans=10.0 2023-09-30 05:52:53,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:53,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 05:52:53,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:52:57,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:58,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:58,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:53:00,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:53:00,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:00,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=616573.3333333334, ans=0.125 2023-09-30 05:53:01,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:01,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 05:53:04,545 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.05 vs. limit=15.0 2023-09-30 05:53:05,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 05:53:05,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:53:05,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 05:53:05,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:05,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:53:07,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 05:53:07,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:53:07,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 05:53:07,403 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 05:53:07,403 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 05:53:07,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 05:53:10,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:10,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:53:10,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:53:12,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:13,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:53:15,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:15,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:19,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=616640.0, ans=0.125 2023-09-30 05:53:19,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616640.0, ans=0.1 2023-09-30 05:53:24,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:53:24,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 05:53:29,835 INFO [train.py:1039] (1/4) Epoch 18, batch 2200, loss[loss=0.2065, simple_loss=0.273, pruned_loss=0.06995, over 22919.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2536, pruned_loss=0.05256, over 4701772.42 frames. ], batch size: 322, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:53:29,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:53:33,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:33,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:53:34,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:53:36,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:53:40,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:40,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:53:40,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 05:53:46,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 05:53:48,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:53:49,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.73 vs. limit=12.0 2023-09-30 05:53:49,922 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.966e+02 2.292e+02 2.651e+02 4.144e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-30 05:53:56,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 05:53:59,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:59,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=616773.3333333334, ans=0.1 2023-09-30 05:54:00,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:00,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=616840.0, ans=0.2 2023-09-30 05:54:02,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:54:03,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:54:05,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 05:54:09,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:54:11,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:11,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:54:15,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:54:18,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:18,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=616906.6666666666, ans=0.0 2023-09-30 05:54:21,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:54:22,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:24,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 05:54:26,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:26,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=616906.6666666666, ans=0.05 2023-09-30 05:54:28,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 05:54:30,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:30,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:54:30,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:33,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:33,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:33,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:35,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:54:35,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:54:36,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:54:39,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:54:40,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:54:42,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:54:44,434 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 05:54:44,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:54:45,043 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616973.3333333334, ans=0.1 2023-09-30 05:54:46,748 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 05:54:46,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:54:46,992 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 05:54:50,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:50,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:54:52,041 INFO [train.py:1039] (1/4) Epoch 18, batch 2250, loss[loss=0.1719, simple_loss=0.2585, pruned_loss=0.0427, over 24450.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2549, pruned_loss=0.05255, over 4714232.54 frames. ], batch size: 69, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:54:52,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:52,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=617040.0, ans=0.05 2023-09-30 05:54:53,752 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 05:54:55,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:54:57,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:54:57,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=617040.0, ans=0.125 2023-09-30 05:55:03,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.13 vs. limit=15.0 2023-09-30 05:55:04,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:55:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:55:09,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:10,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:11,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:13,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=617106.6666666666, ans=0.0 2023-09-30 05:55:14,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 05:55:14,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:14,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:55:17,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 05:55:19,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:55:19,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:20,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:21,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.26 vs. limit=15.0 2023-09-30 05:55:22,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=617173.3333333334, ans=0.125 2023-09-30 05:55:26,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:27,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:55:29,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:55:30,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 05:55:30,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:33,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=617173.3333333334, ans=0.09899494936611666 2023-09-30 05:55:34,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:55:39,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:41,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:41,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=617240.0, ans=0.2 2023-09-30 05:55:42,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:55:42,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:44,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:45,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:55:50,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:55:53,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:55:59,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:55:59,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:55:59,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:56:04,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:56:07,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:56:07,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 05:56:07,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:09,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:56:12,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 05:56:14,355 INFO [train.py:1039] (1/4) Epoch 18, batch 2300, loss[loss=0.225, simple_loss=0.2852, pruned_loss=0.08237, over 23402.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2562, pruned_loss=0.0532, over 4712989.31 frames. ], batch size: 285, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:56:14,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:56:14,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:56:23,532 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 05:56:26,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,609 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.928e+02 2.258e+02 2.853e+02 4.796e+02, threshold=4.517e+02, percent-clipped=2.0 2023-09-30 05:56:33,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:56:33,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:56:33,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:56:33,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 05:56:35,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:56:38,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:56:38,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:56:40,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:56:42,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:56:47,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:56:48,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=617506.6666666666, ans=0.125 2023-09-30 05:56:52,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:56:53,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:56,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:57:00,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:03,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:57:05,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:57:05,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:57:05,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 05:57:10,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:57:10,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:10,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:10,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:57:12,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:12,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:57:12,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:57:13,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 05:57:13,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:57:13,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:13,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 05:57:13,876 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=617573.3333333334, ans=0.0 2023-09-30 05:57:18,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.69 vs. limit=10.0 2023-09-30 05:57:22,450 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:57:27,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:57:30,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:30,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:57:31,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:57:32,507 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.00 vs. limit=15.0 2023-09-30 05:57:35,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:57:35,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:57:35,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:57:35,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 05:57:37,207 INFO [train.py:1039] (1/4) Epoch 18, batch 2350, loss[loss=0.1689, simple_loss=0.2518, pruned_loss=0.04296, over 24691.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2565, pruned_loss=0.05286, over 4716331.60 frames. ], batch size: 73, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:57:42,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:57:42,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 05:57:47,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 05:57:49,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:50,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=617706.6666666666, ans=0.2 2023-09-30 05:57:52,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:57:54,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:56,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 05:57:57,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:58:01,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 05:58:02,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:58:06,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:58:06,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:58:07,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.77 vs. limit=10.0 2023-09-30 05:58:11,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:58:11,937 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=617840.0, ans=0.1 2023-09-30 05:58:13,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 05:58:14,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:58:14,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:58:14,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:16,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:58:19,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:58:22,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 05:58:22,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:58:25,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:58:25,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:58:30,299 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.47 vs. limit=15.0 2023-09-30 05:58:30,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 05:58:31,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:58:34,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 05:58:34,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:58:40,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 05:58:43,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 05:58:45,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:45,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:58:45,371 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 05:58:45,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 05:58:48,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 05:58:50,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:58:55,539 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=617973.3333333334, ans=0.125 2023-09-30 05:58:56,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:58:59,666 INFO [train.py:1039] (1/4) Epoch 18, batch 2400, loss[loss=0.1754, simple_loss=0.2595, pruned_loss=0.04567, over 24311.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2555, pruned_loss=0.0528, over 4710008.41 frames. ], batch size: 77, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 05:59:00,552 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:59:02,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:59:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 05:59:04,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 05:59:13,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:59:13,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:59:15,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 05:59:17,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:59:18,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:18,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 05:59:20,009 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.950e+02 2.202e+02 2.533e+02 3.814e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-30 05:59:25,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:27,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 05:59:31,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:59:34,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 05:59:38,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:59:42,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:46,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:59:46,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 05:59:46,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:59:58,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:00:03,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:06,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:00:06,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:00:06,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:00:06,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:06,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:06,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:00:11,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:11,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:00:12,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 06:00:12,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 06:00:15,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:00:16,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:16,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 06:00:16,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 06:00:16,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 06:00:16,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 06:00:18,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 06:00:20,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:00:20,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:21,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:23,277 INFO [train.py:1039] (1/4) Epoch 18, batch 2450, loss[loss=0.1698, simple_loss=0.2586, pruned_loss=0.0405, over 24316.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2541, pruned_loss=0.05232, over 4715183.72 frames. ], batch size: 74, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:00:23,400 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 06:00:24,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:24,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:00:28,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:00:28,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:31,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.41 vs. limit=12.0 2023-09-30 06:00:33,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:33,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:35,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 06:00:38,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:00:38,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:43,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:00:43,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:00:43,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:00:44,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 06:00:50,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:50,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=618440.0, ans=0.1 2023-09-30 06:00:51,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:00:53,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:53,831 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=618440.0, ans=0.2 2023-09-30 06:00:56,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:00:56,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:01:01,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 06:01:03,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:01:08,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:10,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:01:10,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=618506.6666666666, ans=0.1 2023-09-30 06:01:11,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:11,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:01:12,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:13,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:01:14,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 06:01:20,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:01:20,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:01:24,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:01:24,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:27,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=618573.3333333334, ans=0.1 2023-09-30 06:01:31,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:01:31,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 06:01:32,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:01:32,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:01:32,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 06:01:34,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:01:34,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:01:36,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=618640.0, ans=0.025 2023-09-30 06:01:39,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:01:41,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:41,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=618640.0, ans=0.0 2023-09-30 06:01:43,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:01:46,485 INFO [train.py:1039] (1/4) Epoch 18, batch 2500, loss[loss=0.1771, simple_loss=0.2499, pruned_loss=0.05214, over 23744.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2529, pruned_loss=0.05212, over 4710579.54 frames. ], batch size: 135, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:01:46,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 06:01:48,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:01:54,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:03,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:02:04,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:02:04,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:04,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 06:02:06,283 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.833e+02 2.012e+02 2.316e+02 3.261e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 06:02:13,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:02:13,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:14,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:02:14,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:02:14,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 06:02:18,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:18,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:19,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 06:02:19,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:20,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 06:02:20,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:24,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:02:26,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:28,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:02:29,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 06:02:31,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:02:32,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:38,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:41,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:41,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=618906.6666666666, ans=0.125 2023-09-30 06:02:45,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:02:46,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.10 vs. limit=15.0 2023-09-30 06:02:46,141 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.15 vs. limit=10.0 2023-09-30 06:02:49,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:02:53,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 06:02:53,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:53,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:02:56,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:02:56,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:02:56,625 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 06:02:56,625 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 06:02:56,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 06:03:01,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:04,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 06:03:04,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 06:03:04,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:03:04,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 06:03:06,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=618973.3333333334, ans=0.125 2023-09-30 06:03:09,323 INFO [train.py:1039] (1/4) Epoch 18, batch 2550, loss[loss=0.1842, simple_loss=0.2613, pruned_loss=0.05356, over 23333.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2539, pruned_loss=0.05236, over 4714808.54 frames. ], batch size: 93, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:03:09,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 06:03:11,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:11,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:03:13,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:03:16,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:16,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 06:03:18,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:03:21,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 06:03:23,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:03:26,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:28,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:03:28,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 06:03:29,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:03:29,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:31,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:33,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:03:34,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 06:03:34,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:03:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:34,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 06:03:48,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:03:51,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=619173.3333333334, ans=0.125 2023-09-30 06:03:53,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:03:53,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:53,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:55,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:04:00,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=619240.0, ans=0.0 2023-09-30 06:04:01,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:04:06,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:04:06,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:04:06,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:04:07,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:04:07,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:04:12,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:12,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:17,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:04:17,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 06:04:17,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:04:19,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:20,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:04:20,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:04:22,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:31,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:04:32,418 INFO [train.py:1039] (1/4) Epoch 18, batch 2600, loss[loss=0.2049, simple_loss=0.2801, pruned_loss=0.06482, over 24062.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2545, pruned_loss=0.05259, over 4710973.25 frames. ], batch size: 80, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:04:32,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:35,711 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 06:04:38,654 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 06:04:38,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:04:40,705 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 06:04:40,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 06:04:40,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 06:04:44,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:44,802 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 06:04:45,036 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:04:46,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 06:04:47,862 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 06:04:49,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:04:51,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 06:04:52,586 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.862e+02 2.048e+02 2.291e+02 3.453e+02, threshold=4.097e+02, percent-clipped=0.0 2023-09-30 06:04:52,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 06:04:54,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:04:54,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 06:04:54,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=619440.0, ans=0.125 2023-09-30 06:04:55,963 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 06:04:57,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 06:05:04,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:04,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:04,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:04,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 06:05:07,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:05:15,985 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 06:05:24,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:24,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:25,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 06:05:25,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:25,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:27,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 06:05:30,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:05:30,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:05:33,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,057 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.96 vs. limit=15.0 2023-09-30 06:05:38,607 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 06:05:38,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:05:43,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:45,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:05:45,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 06:05:45,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:50,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:05:50,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:05:52,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=619640.0, ans=0.2 2023-09-30 06:05:54,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 06:05:55,846 INFO [train.py:1039] (1/4) Epoch 18, batch 2650, loss[loss=0.1845, simple_loss=0.2666, pruned_loss=0.05122, over 23366.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2559, pruned_loss=0.05332, over 4711277.40 frames. ], batch size: 93, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:05:56,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:57,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:05:59,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.55 vs. limit=15.0 2023-09-30 06:06:00,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 06:06:00,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:02,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:06:02,463 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 06:06:02,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:06,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:08,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:06:09,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.26 vs. limit=22.5 2023-09-30 06:06:10,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:06:12,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:06:13,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 06:06:13,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:06:13,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:06:18,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 06:06:20,312 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 06:06:23,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:26,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 06:06:28,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:28,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 06:06:33,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:33,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:06:33,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:34,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:39,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 06:06:39,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 06:06:41,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:06:44,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 06:06:44,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:46,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:46,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:06:47,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:48,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:51,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:53,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:06:54,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:54,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:06:55,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=619906.6666666666, ans=0.0 2023-09-30 06:06:56,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:06:57,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:59,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:06:59,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:01,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:07:02,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:07:05,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:06,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:07:06,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:06,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 06:07:10,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:07:11,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:14,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:15,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:16,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:07:16,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:16,718 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.70 vs. limit=15.0 2023-09-30 06:07:17,508 INFO [train.py:1039] (1/4) Epoch 18, batch 2700, loss[loss=0.1842, simple_loss=0.2487, pruned_loss=0.05984, over 23614.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2564, pruned_loss=0.05341, over 4706591.91 frames. ], batch size: 256, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:07:18,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=620040.0, ans=0.125 2023-09-30 06:07:19,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:19,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 06:07:20,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:07:24,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:07:24,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=620040.0, ans=0.0 2023-09-30 06:07:24,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=620040.0, ans=0.125 2023-09-30 06:07:26,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:07:26,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:07:28,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:28,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:07:29,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:07:29,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 06:07:29,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:07:31,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:07:34,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:07:35,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:38,516 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.935e+02 2.170e+02 2.390e+02 3.266e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 06:07:38,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:07:40,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 06:07:40,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:07:45,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=620106.6666666666, ans=0.2 2023-09-30 06:07:47,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:07:47,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:07:51,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=620173.3333333334, ans=0.125 2023-09-30 06:07:52,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:07:53,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:53,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:07:53,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:07:55,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:01,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:01,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:08:01,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:04,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:04,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:08:14,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:08:15,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=620240.0, ans=0.125 2023-09-30 06:08:16,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:08:18,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:08:18,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:19,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=620240.0, ans=0.1 2023-09-30 06:08:21,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:22,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:22,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:25,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:27,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:27,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:08:30,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:08:32,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:32,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:35,102 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.91 vs. limit=15.0 2023-09-30 06:08:35,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 06:08:36,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:39,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:08:39,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 06:08:41,306 INFO [train.py:1039] (1/4) Epoch 18, batch 2750, loss[loss=0.1735, simple_loss=0.256, pruned_loss=0.04549, over 24284.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.257, pruned_loss=0.05366, over 4701426.92 frames. ], batch size: 61, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:08:41,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 06:08:41,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:45,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:08:46,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:49,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:49,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:08:49,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:52,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:08:54,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:08:54,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=620373.3333333334, ans=0.0 2023-09-30 06:08:55,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:08:55,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:55,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 06:08:55,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:55,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:09:01,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 06:09:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:09:03,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:05,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:07,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:09:07,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:09:07,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:09:09,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:10,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:12,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=620506.6666666666, ans=0.1 2023-09-30 06:09:15,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:09:15,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:09:15,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:09:17,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:18,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:09:25,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:28,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:09:29,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:32,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:32,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:09:32,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:09:39,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:09:41,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:41,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 06:09:46,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:47,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 06:09:53,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:09:54,859 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:09:56,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 06:09:58,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:09:59,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:09:59,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 06:09:59,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:10:02,957 INFO [train.py:1039] (1/4) Epoch 18, batch 2800, loss[loss=0.1716, simple_loss=0.24, pruned_loss=0.0516, over 23849.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2553, pruned_loss=0.05319, over 4696734.32 frames. ], batch size: 195, lr: 5.73e-03, grad_scale: 32.0 2023-09-30 06:10:03,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:10:03,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:03,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:04,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 06:10:04,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:06,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:06,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:07,818 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 06:10:07,819 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 06:10:11,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:13,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:10:13,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:10:18,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:10:20,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 06:10:21,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:10:23,090 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.862e+02 2.010e+02 2.282e+02 3.813e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-30 06:10:23,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 06:10:24,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:24,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:10:24,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:30,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:30,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:30,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:10:31,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:10:40,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:10:42,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:45,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:47,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:10:47,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:53,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=620906.6666666666, ans=0.1 2023-09-30 06:10:54,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:10:54,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 06:10:55,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:56,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:56,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:11:01,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:01,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:04,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:11:06,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:11:06,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:06,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:11:07,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:11:09,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:11:10,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.49 vs. limit=15.0 2023-09-30 06:11:10,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:11:10,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 06:11:10,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:12,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:11:12,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:15,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 06:11:15,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:15,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:11:15,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=620973.3333333334, ans=0.07 2023-09-30 06:11:16,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:11:19,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 06:11:25,718 INFO [train.py:1039] (1/4) Epoch 18, batch 2850, loss[loss=0.1817, simple_loss=0.2499, pruned_loss=0.05681, over 23893.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2547, pruned_loss=0.05265, over 4688114.54 frames. ], batch size: 150, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:11:27,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:11:27,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:11:28,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:11:29,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=621040.0, ans=0.125 2023-09-30 06:11:31,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:34,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:11:34,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:11:35,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:39,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:40,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:42,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:11:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 06:11:47,902 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.06 vs. limit=12.0 2023-09-30 06:11:48,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 06:11:48,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:50,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 06:11:50,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:55,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 06:11:55,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 06:11:56,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:11,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:12,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:12,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:12:12,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=621173.3333333334, ans=0.1 2023-09-30 06:12:14,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:12:14,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:12:14,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:12:17,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:12:17,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 06:12:19,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:12:19,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:19,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:19,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:20,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=621240.0, ans=0.1 2023-09-30 06:12:22,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:22,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:23,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:25,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:28,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:12:28,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:28,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:31,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:12:37,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:12:40,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 06:12:40,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 06:12:42,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:12:43,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:43,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 06:12:43,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=621306.6666666666, ans=0.125 2023-09-30 06:12:44,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:12:44,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:46,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:46,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:12:46,135 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 06:12:46,967 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.19 vs. limit=22.5 2023-09-30 06:12:47,677 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 06:12:47,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:12:47,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:49,183 INFO [train.py:1039] (1/4) Epoch 18, batch 2900, loss[loss=0.1876, simple_loss=0.2527, pruned_loss=0.06127, over 23429.00 frames. ], tot_loss[loss=0.18, simple_loss=0.255, pruned_loss=0.0525, over 4694003.43 frames. ], batch size: 285, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:12:52,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:12:53,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:53,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:55,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 06:12:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:58,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 06:13:00,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 06:13:01,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:13:01,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:13:03,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:05,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:13:08,440 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.787e+02 2.109e+02 2.458e+02 3.664e+02, threshold=4.218e+02, percent-clipped=0.0 2023-09-30 06:13:10,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:13:10,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:13:13,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:13:13,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 06:13:14,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:13:16,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:17,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 06:13:19,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 06:13:22,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:13:22,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 06:13:22,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:13:23,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.42 vs. limit=12.0 2023-09-30 06:13:25,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:13:25,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:13:28,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:28,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:30,922 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.42 vs. limit=22.5 2023-09-30 06:13:33,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:13:36,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:13:37,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 06:13:37,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 06:13:37,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:13:43,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:13:48,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 06:13:50,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:13:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:14:02,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:14:02,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:14:03,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 06:14:08,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:08,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 06:14:08,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:08,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:14:10,964 INFO [train.py:1039] (1/4) Epoch 18, batch 2950, loss[loss=0.1712, simple_loss=0.248, pruned_loss=0.0472, over 23695.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2559, pruned_loss=0.05347, over 4674539.16 frames. ], batch size: 232, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:14:14,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:16,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 06:14:18,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:18,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=621706.6666666666, ans=0.125 2023-09-30 06:14:20,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:20,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:14:22,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:14:23,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 06:14:24,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 06:14:25,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:14:25,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:32,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:33,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:14:36,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:14:38,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:40,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.32 vs. limit=22.5 2023-09-30 06:14:41,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:14:41,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:14:44,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:14:46,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 06:14:53,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 06:14:53,349 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 06:14:54,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:14:56,270 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 06:14:58,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 06:14:58,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:58,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:58,531 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 06:14:58,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:15:03,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 06:15:03,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:15:03,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:15:06,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:08,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:15:08,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:08,578 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 06:15:09,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:09,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 06:15:14,796 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:16,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:15:16,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 06:15:16,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:15:19,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 06:15:22,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:26,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:15:26,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:15:28,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:28,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:15:29,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:15:31,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:31,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:15:31,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:15:33,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:34,751 INFO [train.py:1039] (1/4) Epoch 18, batch 3000, loss[loss=0.2012, simple_loss=0.2778, pruned_loss=0.06235, over 23664.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2571, pruned_loss=0.05399, over 4692841.22 frames. ], batch size: 85, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:15:34,752 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 06:15:43,793 INFO [zipformer.py:1853] (1/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([3.6072, 3.3079, 2.3521, 3.4767, 2.5739, 2.9443, 3.4005, 3.3722], device='cuda:1') 2023-09-30 06:15:49,345 INFO [train.py:1071] (1/4) Epoch 18, validation: loss=0.3403, simple_loss=0.2856, pruned_loss=0.1975, over 1125622.00 frames. 2023-09-30 06:15:49,346 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 06:15:49,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:15:51,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:51,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 06:15:52,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:56,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:15:56,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:16:01,508 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 06:16:01,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 06:16:03,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:16:03,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:16:03,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 06:16:04,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:09,646 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.852e+02 2.145e+02 2.482e+02 3.954e+02, threshold=4.290e+02, percent-clipped=0.0 2023-09-30 06:16:12,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:16:22,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:16:28,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 06:16:31,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:16:34,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:16:36,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:36,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:16:37,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:37,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 06:16:38,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 06:16:38,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.36 vs. limit=15.0 2023-09-30 06:16:39,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=622240.0, ans=0.0 2023-09-30 06:16:39,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=622240.0, ans=0.0 2023-09-30 06:16:41,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:16:41,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:16:44,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:16:44,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:44,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:16:44,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:16:48,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=622240.0, ans=0.125 2023-09-30 06:16:49,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:16:49,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:49,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:16:50,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:53,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 06:16:53,815 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.83 vs. limit=10.0 2023-09-30 06:16:54,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:16:54,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:16:56,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:16:59,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:00,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:01,380 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.82 vs. limit=15.0 2023-09-30 06:17:02,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:17:02,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 06:17:02,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:03,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 06:17:05,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:17:07,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 06:17:10,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:12,318 INFO [train.py:1039] (1/4) Epoch 18, batch 3050, loss[loss=0.1666, simple_loss=0.2431, pruned_loss=0.04502, over 24475.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2581, pruned_loss=0.0545, over 4694239.96 frames. ], batch size: 63, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:17:12,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:17:12,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 06:17:13,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 06:17:13,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:17:15,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:17:15,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:15,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:17:16,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:17,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:17:20,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 06:17:22,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:17:25,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:25,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:17:30,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:33,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 06:17:40,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 06:17:40,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 06:17:40,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=622440.0, ans=0.0 2023-09-30 06:17:41,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:17:43,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:17:47,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:48,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:48,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:52,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:17:53,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:53,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:54,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:54,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:56,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:58,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:01,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:02,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 06:18:02,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:18:03,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:18:05,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:18:05,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:18:05,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:05,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=622573.3333333334, ans=0.2 2023-09-30 06:18:06,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:10,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:18:10,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:17,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:17,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:18:17,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:22,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:22,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:18:22,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:18:23,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 06:18:25,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:26,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:27,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 06:18:29,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:34,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:35,974 INFO [train.py:1039] (1/4) Epoch 18, batch 3100, loss[loss=0.1619, simple_loss=0.2474, pruned_loss=0.03816, over 24650.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.257, pruned_loss=0.05415, over 4706619.95 frames. ], batch size: 68, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:18:37,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:18:39,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:18:40,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 06:18:42,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=622706.6666666666, ans=0.125 2023-09-30 06:18:45,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 06:18:47,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 06:18:49,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:18:52,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:52,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:55,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:18:57,048 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.806e+02 2.048e+02 2.293e+02 3.321e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 06:18:58,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:04,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 06:19:08,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:19:08,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:09,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:19:09,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:19:10,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:19:13,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:19:13,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 06:19:13,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:19:13,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:17,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 06:19:18,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:19:22,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:19:23,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 06:19:24,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 06:19:25,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:28,244 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.67 vs. limit=15.0 2023-09-30 06:19:28,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:28,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:28,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:19:30,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:19:30,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:19:33,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:19:33,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:19:33,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:33,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:19:39,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:19:40,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 06:19:43,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:19:43,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 06:19:45,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:45,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:45,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 06:19:50,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=622973.3333333334, ans=0.125 2023-09-30 06:19:57,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 06:19:58,738 INFO [train.py:1039] (1/4) Epoch 18, batch 3150, loss[loss=0.1797, simple_loss=0.237, pruned_loss=0.06115, over 22700.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.255, pruned_loss=0.05341, over 4686342.13 frames. ], batch size: 322, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:20:00,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:00,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:00,732 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=623040.0, ans=0.125 2023-09-30 06:20:03,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:20:03,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:20:04,300 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.64 vs. limit=12.0 2023-09-30 06:20:04,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 06:20:05,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:05,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:20:06,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 06:20:08,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:09,879 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 06:20:14,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 06:20:14,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:20:16,402 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 06:20:18,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:20:18,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 06:20:20,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 06:20:20,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 06:20:20,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:20,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:21,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:23,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 06:20:27,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:30,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:20:33,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 06:20:33,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:20:36,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:20:38,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:38,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 06:20:41,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 06:20:43,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:20:43,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:20:43,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:20:44,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:44,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:20:44,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:20:46,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:20:47,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 06:20:49,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:20:49,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:52,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:20:52,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:52,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 06:20:54,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:20:56,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 06:20:56,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:58,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 06:20:58,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 06:21:01,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:21:01,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:04,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 06:21:04,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 06:21:05,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:21:08,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:21:10,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:10,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:21:15,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:21:16,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:18,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 06:21:21,448 INFO [train.py:1039] (1/4) Epoch 18, batch 3200, loss[loss=0.1607, simple_loss=0.2061, pruned_loss=0.05764, over 18930.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2539, pruned_loss=0.05318, over 4679565.48 frames. ], batch size: 388, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:21:23,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:21:23,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:21:23,519 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=623373.3333333334, ans=0.125 2023-09-30 06:21:25,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=623373.3333333334, ans=0.125 2023-09-30 06:21:28,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:30,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:21:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 06:21:34,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:39,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:21:42,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:43,653 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.833e+02 1.996e+02 2.319e+02 3.127e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-30 06:21:47,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=623440.0, ans=0.0 2023-09-30 06:21:50,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:21:59,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 06:22:01,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:22:01,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=623506.6666666666, ans=0.125 2023-09-30 06:22:04,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 06:22:04,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:22:08,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:22:08,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:22:10,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:22:15,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 06:22:17,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:22:17,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=623573.3333333334, ans=0.2 2023-09-30 06:22:20,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 06:22:21,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 06:22:24,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:22:32,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:22:32,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,925 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 06:22:32,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:22:37,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:22:39,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 06:22:41,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 06:22:41,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 06:22:43,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 06:22:44,462 INFO [train.py:1039] (1/4) Epoch 18, batch 3250, loss[loss=0.1858, simple_loss=0.2558, pruned_loss=0.05787, over 21440.00 frames. ], tot_loss[loss=0.18, simple_loss=0.254, pruned_loss=0.05302, over 4685005.63 frames. ], batch size: 47, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:22:44,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:22:47,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=623706.6666666666, ans=0.125 2023-09-30 06:22:48,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:22:48,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 06:22:48,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:22:48,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:22:50,045 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 06:22:53,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:22:53,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=623706.6666666666, ans=0.125 2023-09-30 06:22:56,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:23:04,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.08 vs. limit=15.0 2023-09-30 06:23:05,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:05,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 06:23:07,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:07,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:23:07,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:08,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:08,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:23:09,719 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.68 vs. limit=15.0 2023-09-30 06:23:13,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:23:13,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:13,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:23:17,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:19,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:21,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:21,269 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:21,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=623840.0, ans=0.0 2023-09-30 06:23:22,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:24,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:24,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:28,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 06:23:30,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:23:30,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:23:32,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:32,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:23:37,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:23:46,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:23:46,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:46,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 06:23:46,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:23:46,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:23:46,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:49,809 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:23:51,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 06:23:51,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 06:23:51,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:53,179 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.27 vs. limit=22.5 2023-09-30 06:23:54,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:55,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:56,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:23:57,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:24:00,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:00,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:02,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 06:24:03,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:06,763 INFO [train.py:1039] (1/4) Epoch 18, batch 3300, loss[loss=0.1577, simple_loss=0.2412, pruned_loss=0.03707, over 24448.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2545, pruned_loss=0.05278, over 4702597.33 frames. ], batch size: 58, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:24:06,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:24:06,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 06:24:07,666 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.67 vs. limit=15.0 2023-09-30 06:24:09,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:24:09,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 06:24:12,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 06:24:12,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 06:24:13,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:18,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:18,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=624040.0, ans=0.125 2023-09-30 06:24:19,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:24:19,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:22,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:24:22,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:24:26,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:26,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:30,035 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.896e+02 2.095e+02 2.468e+02 4.456e+02, threshold=4.189e+02, percent-clipped=2.0 2023-09-30 06:24:33,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 06:24:33,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:24:33,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:35,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:36,822 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 06:24:36,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:24:37,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:24:38,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:24:38,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:24:38,557 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 06:24:43,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:43,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:24:43,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=624173.3333333334, ans=0.015 2023-09-30 06:24:44,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:44,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 06:24:46,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 06:24:46,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:47,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.46 vs. limit=10.0 2023-09-30 06:24:47,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:24:49,373 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 06:24:51,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 06:24:52,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:24:53,644 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.90 vs. limit=12.0 2023-09-30 06:24:54,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 06:24:56,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:24:59,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:24:59,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:03,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:04,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:04,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:25:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:25:04,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=624240.0, ans=0.125 2023-09-30 06:25:07,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:25:07,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:07,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:25:07,639 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:25:08,821 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 06:25:08,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 06:25:12,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:25:12,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:12,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:15,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:15,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:16,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:25:16,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:16,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:25:18,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:19,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:25:21,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 06:25:22,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:22,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:23,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:25:25,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:25:26,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:28,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:28,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:30,156 INFO [train.py:1039] (1/4) Epoch 18, batch 3350, loss[loss=0.1904, simple_loss=0.2505, pruned_loss=0.06516, over 23827.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2556, pruned_loss=0.0526, over 4714456.12 frames. ], batch size: 164, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:25:33,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:34,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=624373.3333333334, ans=0.125 2023-09-30 06:25:35,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:35,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:25:39,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:40,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:25:41,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=624373.3333333334, ans=0.1 2023-09-30 06:25:44,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:44,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:25:44,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=624373.3333333334, ans=0.0 2023-09-30 06:25:45,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 06:25:47,215 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 06:25:47,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:49,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=624440.0, ans=0.04949747468305833 2023-09-30 06:25:51,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 06:25:51,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 06:25:53,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:25:53,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:25:54,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:55,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=624440.0, ans=0.1 2023-09-30 06:25:56,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 06:25:56,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:56,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:25:58,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=624440.0, ans=0.0 2023-09-30 06:25:59,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:01,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:02,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:02,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:26:06,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:09,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:09,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:13,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:26:15,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:17,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:17,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=624506.6666666666, ans=0.125 2023-09-30 06:26:18,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:21,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:23,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 06:26:24,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:26:24,702 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 06:26:24,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:26:27,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 06:26:27,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:28,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=624573.3333333334, ans=0.0 2023-09-30 06:26:29,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:37,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:39,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 06:26:39,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:26:40,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:26:42,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:26:47,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:26:51,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 06:26:51,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:26:51,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:26:53,299 INFO [train.py:1039] (1/4) Epoch 18, batch 3400, loss[loss=0.1687, simple_loss=0.2496, pruned_loss=0.04389, over 24080.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2562, pruned_loss=0.05242, over 4707565.10 frames. ], batch size: 80, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:26:53,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:53,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 06:26:54,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:54,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 06:26:56,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:26:58,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:26:58,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 06:27:02,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 06:27:02,826 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 06:27:02,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:07,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:27:07,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:27:09,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:10,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:27:16,036 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.809e+02 2.054e+02 2.312e+02 3.383e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 06:27:16,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:17,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 06:27:22,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:27:24,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:25,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:27,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:27:35,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:27:39,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 06:27:41,901 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=624906.6666666666, ans=0.0 2023-09-30 06:27:42,317 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.27 vs. limit=22.5 2023-09-30 06:27:45,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:47,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:48,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 06:27:48,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:27:50,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:50,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:27:53,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:55,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=624906.6666666666, ans=0.125 2023-09-30 06:27:58,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:28:00,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:28:04,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:07,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 06:28:07,471 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=624973.3333333334, ans=0.125 2023-09-30 06:28:09,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.57 vs. limit=15.0 2023-09-30 06:28:13,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:28:14,627 INFO [train.py:1039] (1/4) Epoch 18, batch 3450, loss[loss=0.1762, simple_loss=0.2345, pruned_loss=0.05894, over 23751.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2562, pruned_loss=0.0521, over 4723715.32 frames. ], batch size: 232, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:28:16,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 06:28:21,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 06:28:21,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:28:23,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:28:23,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 06:28:23,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:27,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:28:32,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:28:32,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:34,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:28:34,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:37,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:38,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=625106.6666666666, ans=0.0 2023-09-30 06:28:42,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 06:28:48,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 06:28:48,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:28:48,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:28:52,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:57,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 06:28:59,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:29:04,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:05,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:29:07,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:29:08,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:29:12,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 06:29:12,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:12,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:29:16,271 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=625240.0, ans=10.0 2023-09-30 06:29:16,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:29:18,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 06:29:23,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:29:25,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=625306.6666666666, ans=0.0 2023-09-30 06:29:28,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:29:28,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:33,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:37,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:37,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:38,627 INFO [train.py:1039] (1/4) Epoch 18, batch 3500, loss[loss=0.1762, simple_loss=0.2613, pruned_loss=0.04557, over 24301.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2546, pruned_loss=0.0524, over 4695225.74 frames. ], batch size: 74, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:29:38,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:29:40,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:45,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:47,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:29:48,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 06:29:50,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:29:53,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:29:55,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:55,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 06:30:01,127 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.444e+02 1.872e+02 2.058e+02 2.368e+02 3.255e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 06:30:01,357 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:30:01,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:30:03,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:30:03,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:03,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:30:03,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:05,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:05,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 06:30:09,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:10,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:30:12,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:14,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:16,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 06:30:16,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:18,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:18,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=625506.6666666666, ans=0.0 2023-09-30 06:30:20,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:30:22,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:24,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:30:24,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:24,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=625506.6666666666, ans=0.125 2023-09-30 06:30:24,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=625506.6666666666, ans=0.0 2023-09-30 06:30:25,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 06:30:27,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 06:30:27,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 06:30:29,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:30,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:30,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:30,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:30:33,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:30:34,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:30:41,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:30:42,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 06:30:42,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 06:30:42,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:30:46,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:30:46,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:46,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:47,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=625640.0, ans=0.2 2023-09-30 06:30:49,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 06:30:50,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:52,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=625640.0, ans=0.2 2023-09-30 06:30:53,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:53,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 06:30:56,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 06:30:59,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:00,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:31:01,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:01,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:02,355 INFO [train.py:1039] (1/4) Epoch 18, batch 3550, loss[loss=0.182, simple_loss=0.2497, pruned_loss=0.05713, over 23788.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.253, pruned_loss=0.0518, over 4701589.78 frames. ], batch size: 212, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:31:04,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:31:04,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=625706.6666666666, ans=0.1 2023-09-30 06:31:08,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=625706.6666666666, ans=0.0 2023-09-30 06:31:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:12,983 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=625706.6666666666, ans=0.125 2023-09-30 06:31:16,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:31:18,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:19,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:31:21,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:22,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:31:22,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:31:27,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:27,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:31:27,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:29,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:31:29,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:31:36,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:31:36,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:38,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:38,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:38,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:31:38,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 06:31:38,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:40,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:41,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:31:48,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:50,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:50,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:51,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 06:31:54,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:31:55,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 06:31:57,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:59,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:31:59,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:32:02,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 06:32:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:07,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:09,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 06:32:09,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:15,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:32:16,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 06:32:20,486 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=625973.3333333334, ans=0.2 2023-09-30 06:32:23,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 06:32:23,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:32:24,961 INFO [train.py:1039] (1/4) Epoch 18, batch 3600, loss[loss=0.1848, simple_loss=0.2528, pruned_loss=0.0584, over 23424.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2532, pruned_loss=0.05171, over 4717180.13 frames. ], batch size: 285, lr: 5.70e-03, grad_scale: 32.0 2023-09-30 06:32:25,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:32:27,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:28,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:30,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:32:35,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:35,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:37,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:32:37,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:32:38,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:38,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 06:32:43,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:32:43,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:46,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:47,229 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=15.0 2023-09-30 06:32:47,985 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.865e+02 1.967e+02 2.260e+02 3.686e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 06:32:49,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:32:51,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:32:52,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:52,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 06:32:54,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:56,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:58,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:32:59,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:01,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:33:03,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:03,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 06:33:13,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:14,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:33:16,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 06:33:21,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:33:25,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:27,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:33,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=626306.6666666666, ans=0.125 2023-09-30 06:33:34,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:33:34,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:33:34,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 06:33:34,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=626306.6666666666, ans=0.125 2023-09-30 06:33:34,959 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=626306.6666666666, ans=0.1 2023-09-30 06:33:36,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 06:33:38,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 06:33:41,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:41,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:33:41,359 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=626306.6666666666, ans=0.1 2023-09-30 06:33:42,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 06:33:44,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:33:44,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:33:44,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:45,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 06:33:47,870 INFO [train.py:1039] (1/4) Epoch 18, batch 3650, loss[loss=0.1656, simple_loss=0.246, pruned_loss=0.04257, over 24503.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2539, pruned_loss=0.05146, over 4721132.61 frames. ], batch size: 63, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:33:47,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 06:33:49,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:50,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=626373.3333333334, ans=0.125 2023-09-30 06:33:51,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 06:33:55,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 06:33:57,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:34:00,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 06:34:02,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 06:34:02,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=626440.0, ans=0.0 2023-09-30 06:34:07,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:07,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:34:08,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:34:13,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:34:13,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:34:14,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 06:34:14,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:34:14,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:14,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 06:34:17,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:34:19,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:34:19,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:21,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:34:24,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 06:34:24,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 06:34:25,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:34:28,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 06:34:28,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:28,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:34:30,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=626506.6666666666, ans=15.0 2023-09-30 06:34:35,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=626573.3333333334, ans=0.0 2023-09-30 06:34:36,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:34:38,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:38,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:34:40,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:34:40,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:34:41,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:34:45,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:45,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:45,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:49,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:34:50,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:50,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:34:58,875 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 06:35:03,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:03,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:03,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:35:05,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:05,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:35:06,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:08,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 06:35:08,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:09,623 INFO [train.py:1039] (1/4) Epoch 18, batch 3700, loss[loss=0.1925, simple_loss=0.2703, pruned_loss=0.05735, over 24387.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2547, pruned_loss=0.0521, over 4722922.50 frames. ], batch size: 77, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:35:11,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:35:14,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:35:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:35:18,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:18,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 06:35:18,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:20,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:35:20,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:35:24,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:35:28,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:28,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:29,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:35:30,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:31,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:35:34,543 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.984e+02 2.156e+02 2.490e+02 5.109e+02, threshold=4.311e+02, percent-clipped=1.0 2023-09-30 06:35:34,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:34,909 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 06:35:42,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:35:42,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:35:45,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:35:45,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 06:35:45,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:35:47,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=626840.0, ans=0.0 2023-09-30 06:35:50,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:52,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 06:35:52,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=626840.0, ans=0.125 2023-09-30 06:35:54,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:54,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:35:57,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:57,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:35:59,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:36:04,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:06,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 06:36:06,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:06,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 06:36:10,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:36:12,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:36:14,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:15,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 06:36:17,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:36:17,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:36:18,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:18,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:21,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:23,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 06:36:25,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 06:36:25,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:36:25,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:27,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:36:28,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:36:30,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:36:31,894 INFO [train.py:1039] (1/4) Epoch 18, batch 3750, loss[loss=0.1736, simple_loss=0.2608, pruned_loss=0.04318, over 24644.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.256, pruned_loss=0.05244, over 4727221.71 frames. ], batch size: 73, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:36:32,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:36:34,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:36:35,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 06:36:37,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:36:40,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:36:41,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 06:36:42,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:36:43,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:45,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:46,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:36:50,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:51,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:53,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:36:57,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:37:00,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:00,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 06:37:01,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:03,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:04,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:37:08,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 06:37:08,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=627173.3333333334, ans=0.125 2023-09-30 06:37:11,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 06:37:13,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:15,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:16,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:21,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:24,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:37:28,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 06:37:31,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:34,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:37:34,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:37:39,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:37:44,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:37:45,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:37:48,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:37:49,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:37:51,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:37:54,290 INFO [train.py:1039] (1/4) Epoch 18, batch 3800, loss[loss=0.1604, simple_loss=0.2418, pruned_loss=0.0395, over 24667.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2564, pruned_loss=0.05291, over 4726530.90 frames. ], batch size: 65, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:37:59,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:38:04,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:06,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:38:06,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 06:38:07,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:10,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:10,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:38:13,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 06:38:13,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:14,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:38:15,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:15,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:38:17,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:17,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 06:38:19,117 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.827e+02 2.039e+02 2.367e+02 3.749e+02, threshold=4.078e+02, percent-clipped=0.0 2023-09-30 06:38:21,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 06:38:23,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:38:25,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:27,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:38:28,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:38:29,919 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=22.5 2023-09-30 06:38:30,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:38:30,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:30,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=627506.6666666666, ans=0.0 2023-09-30 06:38:32,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:33,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:40,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:38:40,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 06:38:42,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:38:48,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:38:55,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:38:56,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=627573.3333333334, ans=0.1 2023-09-30 06:38:57,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 06:39:00,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 06:39:01,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:03,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:39:03,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=627640.0, ans=0.125 2023-09-30 06:39:05,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:05,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 06:39:08,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 06:39:08,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 06:39:08,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:10,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=627640.0, ans=0.0 2023-09-30 06:39:11,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:39:16,694 INFO [train.py:1039] (1/4) Epoch 18, batch 3850, loss[loss=0.1723, simple_loss=0.25, pruned_loss=0.04734, over 24460.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2548, pruned_loss=0.05263, over 4722468.47 frames. ], batch size: 63, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:39:18,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:39:18,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:39:23,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:39:24,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 06:39:24,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:39:26,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:29,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:39:32,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:33,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=627773.3333333334, ans=15.0 2023-09-30 06:39:35,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:39:37,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 06:39:42,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:45,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:49,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:39:49,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:39:52,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:52,271 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:39:53,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:53,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:39:53,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:56,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:57,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:59,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:39:59,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 06:39:59,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 06:40:00,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:00,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:05,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 06:40:07,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=627906.6666666666, ans=0.125 2023-09-30 06:40:08,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 06:40:10,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:11,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 06:40:14,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:40:19,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:21,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:24,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:25,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 06:40:28,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 06:40:30,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:31,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:34,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:40:34,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:40:35,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:40:37,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 06:40:37,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:38,624 INFO [train.py:1039] (1/4) Epoch 18, batch 3900, loss[loss=0.161, simple_loss=0.2373, pruned_loss=0.04234, over 24335.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2534, pruned_loss=0.05226, over 4723204.12 frames. ], batch size: 56, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:40:38,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 06:40:38,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:38,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:39,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=628040.0, ans=0.0 2023-09-30 06:40:40,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:40:42,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:43,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:40:43,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:43,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:45,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:40:45,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 06:40:45,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:48,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:49,171 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=628040.0, ans=0.0 2023-09-30 06:40:50,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:50,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:40:52,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:53,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:53,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:55,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:40:57,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 06:40:57,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:40:59,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 06:41:00,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:41:00,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 06:41:02,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 06:41:03,819 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.888e+02 2.025e+02 2.251e+02 3.863e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 06:41:08,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:08,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:41:08,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:41:10,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:16,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:18,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:41:21,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:41:21,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:23,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:41:28,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:41:28,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:41:36,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=628240.0, ans=0.0 2023-09-30 06:41:37,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:41:39,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:41:40,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=628240.0, ans=0.0 2023-09-30 06:41:49,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:41:51,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:51,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 06:41:52,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 06:41:52,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:54,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 06:41:56,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:57,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 06:42:00,988 INFO [train.py:1039] (1/4) Epoch 18, batch 3950, loss[loss=0.1746, simple_loss=0.244, pruned_loss=0.05259, over 23655.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2528, pruned_loss=0.05234, over 4699199.79 frames. ], batch size: 256, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:42:04,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:42:06,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 06:42:06,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:42:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:42:10,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:42:16,497 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 06:42:17,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:18,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 06:42:19,466 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 06:42:19,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:42:23,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:23,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:42:23,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:26,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 06:42:28,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:42:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:28,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:42:30,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:42:31,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:42:35,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=628506.6666666666, ans=0.125 2023-09-30 06:42:45,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:42:45,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:42:49,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 06:42:54,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 06:42:54,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 06:42:55,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:42:56,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:43:04,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:43:04,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:43:05,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:05,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff3.min_abs, batch_count=628573.3333333334, ans=0.2 2023-09-30 06:43:06,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:43:06,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 06:43:08,929 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.33 vs. limit=22.5 2023-09-30 06:43:11,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:43:12,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:43:18,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 06:43:24,664 INFO [train.py:1039] (1/4) Epoch 18, batch 4000, loss[loss=0.184, simple_loss=0.2699, pruned_loss=0.04904, over 23313.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2539, pruned_loss=0.05228, over 4708874.29 frames. ], batch size: 93, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:43:27,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:37,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:42,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:43,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:43:44,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:44,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 06:43:44,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:43:45,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 06:43:45,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:43:45,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 06:43:47,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:48,589 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.924e+02 2.193e+02 2.601e+02 4.615e+02, threshold=4.387e+02, percent-clipped=1.0 2023-09-30 06:43:50,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:43:52,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:43:52,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:43:52,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:52,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:43:54,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:43:56,112 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 06:43:57,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:43:57,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:43:57,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=628840.0, ans=0.1 2023-09-30 06:43:59,439 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 06:44:00,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:44:00,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:06,129 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=22.5 2023-09-30 06:44:12,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 06:44:12,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:44:14,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:44:15,904 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 06:44:16,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:44:18,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 06:44:18,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:44:18,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:20,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:44:21,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:44:21,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:44:22,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:23,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 06:44:25,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:26,615 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 06:44:33,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:44:36,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:44:37,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:44:37,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:39,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:44:39,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:44:43,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:45,896 INFO [train.py:1039] (1/4) Epoch 18, batch 4050, loss[loss=0.1985, simple_loss=0.2644, pruned_loss=0.06628, over 23855.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2548, pruned_loss=0.05251, over 4715256.91 frames. ], batch size: 179, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:44:48,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:44:48,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 06:44:51,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:44:51,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:44:52,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:44:53,367 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.whiten.whitening_limit, batch_count=629040.0, ans=12.0 2023-09-30 06:44:54,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:44:54,288 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=629040.0, ans=0.2 2023-09-30 06:44:54,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=629040.0, ans=0.1 2023-09-30 06:44:55,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:00,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:02,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:04,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:45:07,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:45:07,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:45:11,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:13,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:45:13,783 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=629106.6666666666, ans=0.0 2023-09-30 06:45:16,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 06:45:19,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 06:45:19,429 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 06:45:22,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:45:31,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 06:45:31,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:45:31,442 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=629173.3333333334, ans=0.125 2023-09-30 06:45:34,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:37,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:37,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:45:37,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:41,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:44,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 06:45:44,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:45:46,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:45:47,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 06:45:52,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:46:01,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 06:46:01,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:01,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:46:04,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 06:46:04,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 06:46:04,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:07,714 INFO [train.py:1039] (1/4) Epoch 18, batch 4100, loss[loss=0.1852, simple_loss=0.2557, pruned_loss=0.05734, over 23293.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2556, pruned_loss=0.05275, over 4711075.53 frames. ], batch size: 119, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:46:07,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:11,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:11,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:46:18,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 06:46:18,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=629373.3333333334, ans=0.0 2023-09-30 06:46:19,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 06:46:21,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 06:46:22,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 06:46:22,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:24,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:46:24,383 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 06:46:27,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:27,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:46:27,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:29,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:46:33,103 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.931e+02 2.115e+02 2.277e+02 3.051e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-30 06:46:34,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:46:36,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:36,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:46:36,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 06:46:37,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:46:37,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:39,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:46:40,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 06:46:44,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:46:47,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 06:46:48,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:49,226 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=629506.6666666666, ans=0.125 2023-09-30 06:46:52,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:52,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 06:46:52,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:54,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:46:54,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:46:55,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 06:46:56,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=629573.3333333334, ans=0.2 2023-09-30 06:46:57,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:46:58,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:46:59,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=629573.3333333334, ans=0.0 2023-09-30 06:47:00,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 06:47:00,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:00,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:05,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:09,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:12,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:14,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:47:16,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=629640.0, ans=0.125 2023-09-30 06:47:23,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:23,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:26,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:26,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=629640.0, ans=0.125 2023-09-30 06:47:29,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:47:30,943 INFO [train.py:1039] (1/4) Epoch 18, batch 4150, loss[loss=0.2059, simple_loss=0.2578, pruned_loss=0.07699, over 19786.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2566, pruned_loss=0.05371, over 4696534.88 frames. ], batch size: 388, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:47:32,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:32,948 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=629706.6666666666, ans=0.125 2023-09-30 06:47:34,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:47:35,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:47:35,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:47:38,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 06:47:38,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:38,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 06:47:40,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 06:47:40,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 06:47:42,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:47,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:47:47,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:52,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:54,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:47:56,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:47:59,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:47:59,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:48:00,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:48:02,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:05,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:05,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 06:48:09,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 06:48:09,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:48:10,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=629840.0, ans=0.1 2023-09-30 06:48:11,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 06:48:11,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:48:11,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:15,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:17,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:20,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 06:48:23,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:25,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:48:26,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 06:48:29,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:30,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 06:48:33,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:48:33,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:35,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:36,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 06:48:36,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:48:36,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:48:36,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=629973.3333333334, ans=0.0 2023-09-30 06:48:37,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=629973.3333333334, ans=0.125 2023-09-30 06:48:38,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:48:41,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 06:48:41,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:41,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:48:41,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:48:42,958 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 06:48:43,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:43,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:48:44,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:46,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:46,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 06:48:46,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:53,163 INFO [train.py:1039] (1/4) Epoch 18, batch 4200, loss[loss=0.1625, simple_loss=0.2111, pruned_loss=0.05696, over 19567.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2549, pruned_loss=0.0527, over 4692918.70 frames. ], batch size: 389, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:48:53,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:48:56,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 06:48:56,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:49:00,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:02,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:49:03,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:03,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:05,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 06:49:07,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 06:49:07,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:08,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:11,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:49:13,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:49:16,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:17,451 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.914e+02 2.122e+02 2.477e+02 4.078e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 06:49:17,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:17,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 06:49:17,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:19,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:19,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:20,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:49:20,579 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.66 vs. limit=15.0 2023-09-30 06:49:21,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:49:22,361 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.03 vs. limit=22.5 2023-09-30 06:49:23,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=630106.6666666666, ans=0.025 2023-09-30 06:49:25,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 06:49:26,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:29,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:49:31,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:49:33,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:49:34,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:49:37,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:49:37,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 06:49:37,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:49:38,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:49:44,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:49:46,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:49,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=630240.0, ans=0.0 2023-09-30 06:49:52,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:49:55,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 06:49:58,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:04,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:50:04,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:06,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 06:50:11,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:50:14,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:50:15,758 INFO [train.py:1039] (1/4) Epoch 18, batch 4250, loss[loss=0.1753, simple_loss=0.2627, pruned_loss=0.044, over 24643.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2536, pruned_loss=0.05254, over 4694520.22 frames. ], batch size: 73, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:50:15,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:50:18,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:25,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:50:25,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 06:50:26,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:50:28,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:32,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:34,612 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.83 vs. limit=15.0 2023-09-30 06:50:36,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:36,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:37,183 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=630440.0, ans=0.035 2023-09-30 06:50:39,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:50:39,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:50:40,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:42,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:42,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:44,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:50:45,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:47,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 06:50:50,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 06:50:51,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:53,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:53,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:53,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:50:53,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:54,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:58,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:50:59,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:51:03,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:04,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:06,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 06:51:06,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:51:08,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 06:51:10,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:51:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:51:15,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:15,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:51:18,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 06:51:20,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:51:21,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:51:25,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:29,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:30,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:51:32,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:32,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:34,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:51:36,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:51:36,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 06:51:37,591 INFO [train.py:1039] (1/4) Epoch 18, batch 4300, loss[loss=0.1658, simple_loss=0.2492, pruned_loss=0.04124, over 24479.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2539, pruned_loss=0.0523, over 4713821.18 frames. ], batch size: 66, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:51:37,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:42,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:42,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:51:43,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.74 vs. limit=15.0 2023-09-30 06:51:47,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:54,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=630773.3333333334, ans=0.0 2023-09-30 06:51:54,995 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=630773.3333333334, ans=0.2 2023-09-30 06:51:56,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:56,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 06:51:56,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:51:59,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:51:59,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:51:59,314 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 06:52:01,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=630773.3333333334, ans=0.125 2023-09-30 06:52:02,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:52:03,683 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.954e+02 2.321e+02 2.799e+02 4.498e+02, threshold=4.642e+02, percent-clipped=1.0 2023-09-30 06:52:05,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:07,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 06:52:07,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:52:09,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 06:52:10,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:52:12,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:52:13,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:52:14,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:52:14,279 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=630840.0, ans=0.2 2023-09-30 06:52:15,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:52:17,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:19,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:52:19,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 06:52:21,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 06:52:23,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:52:26,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:26,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:52:26,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:28,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:28,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 06:52:28,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 06:52:29,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 06:52:29,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:52:29,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 06:52:31,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 06:52:34,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:35,924 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 06:52:38,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:52:38,435 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=630906.6666666666, ans=0.0 2023-09-30 06:52:40,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:40,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:42,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=630973.3333333334, ans=0.05 2023-09-30 06:52:44,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 06:52:44,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:44,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:45,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:52:45,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:52:45,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:52:45,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=630973.3333333334, ans=0.125 2023-09-30 06:52:47,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:52:52,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:52,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:54,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:52:57,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.21 vs. limit=22.5 2023-09-30 06:53:00,704 INFO [train.py:1039] (1/4) Epoch 18, batch 4350, loss[loss=0.1901, simple_loss=0.2565, pruned_loss=0.06179, over 23579.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2549, pruned_loss=0.05305, over 4705696.58 frames. ], batch size: 149, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:53:00,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 06:53:02,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:53:05,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:05,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=631040.0, ans=0.125 2023-09-30 06:53:09,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:12,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:53:12,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:53:17,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:53:20,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:20,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=631106.6666666666, ans=0.125 2023-09-30 06:53:23,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:53:23,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:53:26,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:53:28,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:53:29,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:53:36,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 06:53:37,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:39,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:44,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:47,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 06:53:49,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:53:50,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:53:56,825 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 06:53:58,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:53:58,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:53:59,893 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 06:54:01,851 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 06:54:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:03,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:04,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:54:04,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:05,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=631306.6666666666, ans=0.125 2023-09-30 06:54:06,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:07,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=631306.6666666666, ans=0.1 2023-09-30 06:54:08,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:10,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 06:54:10,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:10,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:12,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:12,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 06:54:13,629 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 06:54:13,637 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 06:54:13,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 06:54:18,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:54:18,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:54:18,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:19,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:54:21,160 INFO [train.py:1039] (1/4) Epoch 18, batch 4400, loss[loss=0.1758, simple_loss=0.2518, pruned_loss=0.04994, over 24600.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.256, pruned_loss=0.05282, over 4721521.18 frames. ], batch size: 60, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:54:21,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 06:54:22,825 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 06:54:22,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:26,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:26,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:29,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:32,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 06:54:32,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 06:54:32,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 06:54:33,983 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 06:54:34,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:54:34,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:37,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 06:54:39,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:40,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:40,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 06:54:46,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:46,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 06:54:47,816 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 06:54:49,103 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.990e+02 2.241e+02 2.697e+02 4.171e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-30 06:54:50,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 06:54:50,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 06:54:52,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 06:54:52,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:52,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:52,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=631506.6666666666, ans=0.2 2023-09-30 06:54:53,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:55,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:57,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 06:54:57,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 06:54:58,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:00,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:55:00,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:02,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:02,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=631506.6666666666, ans=0.1 2023-09-30 06:55:03,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:03,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 06:55:05,277 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 06:55:07,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:10,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=631573.3333333334, ans=0.2 2023-09-30 06:55:15,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:55:17,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 06:55:19,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:55:20,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:25,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:55:25,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 06:55:25,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:55:27,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:55:27,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:55:28,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:55:33,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 06:55:36,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 06:55:37,943 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.20 vs. limit=15.0 2023-09-30 06:55:38,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 06:55:38,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:38,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 06:55:40,035 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:55:41,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:55:43,111 INFO [train.py:1039] (1/4) Epoch 18, batch 4450, loss[loss=0.1855, simple_loss=0.2732, pruned_loss=0.04894, over 24533.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2566, pruned_loss=0.05294, over 4721557.51 frames. ], batch size: 71, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:55:43,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 06:55:45,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.01 vs. limit=15.0 2023-09-30 06:55:48,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:50,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:50,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:55:56,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:55:56,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:56:01,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:04,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:56:05,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:56:05,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:07,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 06:56:07,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:08,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:10,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:10,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:56:13,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:56:17,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:19,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:21,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:21,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:24,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:56:27,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:56:29,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 06:56:31,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 06:56:31,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:56:34,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:34,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 06:56:39,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:56:42,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:44,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 06:56:44,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:44,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:56:44,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:56:44,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:44,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=631906.6666666666, ans=0.2 2023-09-30 06:56:44,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=631906.6666666666, ans=0.125 2023-09-30 06:56:45,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=631906.6666666666, ans=0.2 2023-09-30 06:56:47,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:48,906 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=631973.3333333334, ans=0.0 2023-09-30 06:56:50,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:56:51,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 06:56:52,615 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.68 vs. limit=15.0 2023-09-30 06:56:53,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:56:57,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:57,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:56:58,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=631973.3333333334, ans=0.125 2023-09-30 06:57:00,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:00,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:57:03,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:57:05,021 INFO [train.py:1039] (1/4) Epoch 18, batch 4500, loss[loss=0.1903, simple_loss=0.2736, pruned_loss=0.05351, over 24038.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.257, pruned_loss=0.05321, over 4707259.13 frames. ], batch size: 80, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:57:05,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 06:57:08,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:57:09,829 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.86 vs. limit=22.5 2023-09-30 06:57:14,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:15,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 06:57:15,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 06:57:17,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:20,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:22,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:22,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:57:23,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:57:24,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:25,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:33,296 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.852e+02 2.175e+02 2.491e+02 3.622e+02, threshold=4.350e+02, percent-clipped=0.0 2023-09-30 06:57:37,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=632173.3333333334, ans=0.125 2023-09-30 06:57:38,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:38,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:57:42,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:57:42,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:57:44,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:57:50,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:57:55,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:57:58,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:58:01,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:58:01,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 06:58:02,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:02,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:03,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:05,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:58:08,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:58:09,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 06:58:09,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:58:09,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:12,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=632306.6666666666, ans=0.125 2023-09-30 06:58:12,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=632306.6666666666, ans=0.1 2023-09-30 06:58:15,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:58:15,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:58:17,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:21,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:58:21,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:58:24,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 06:58:25,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 06:58:25,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 06:58:26,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=632306.6666666666, ans=0.125 2023-09-30 06:58:28,738 INFO [train.py:1039] (1/4) Epoch 18, batch 4550, loss[loss=0.1757, simple_loss=0.2284, pruned_loss=0.06147, over 22648.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2552, pruned_loss=0.05292, over 4695688.84 frames. ], batch size: 322, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:58:28,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 06:58:30,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=632373.3333333334, ans=0.125 2023-09-30 06:58:32,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 06:58:32,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:36,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:36,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:40,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:45,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:58:47,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:47,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=632440.0, ans=0.125 2023-09-30 06:58:48,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:58:50,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:58:50,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:52,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:54,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:56,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:58:57,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 06:58:59,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 06:58:59,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:59:00,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 06:59:03,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 06:59:04,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:05,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 06:59:07,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:59:12,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:13,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:59:15,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 06:59:18,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:21,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:21,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:23,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:26,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 06:59:28,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 06:59:28,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:59:28,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 06:59:33,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 06:59:33,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:33,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:34,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:59:34,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:36,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:59:37,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:59:38,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 06:59:39,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:39,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 06:59:41,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 06:59:41,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:59:41,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 06:59:46,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:59:46,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:59:48,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:59:48,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:50,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:59:50,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:59:51,595 INFO [train.py:1039] (1/4) Epoch 18, batch 4600, loss[loss=0.1731, simple_loss=0.2473, pruned_loss=0.04942, over 24459.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2547, pruned_loss=0.05288, over 4704993.97 frames. ], batch size: 58, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:59:53,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:59:54,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:56,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:00:01,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:00:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:00:01,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:02,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=632706.6666666666, ans=0.125 2023-09-30 07:00:03,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 07:00:05,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:00:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:00:10,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:11,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:12,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=632773.3333333334, ans=0.5 2023-09-30 07:00:18,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 07:00:18,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:20,155 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.823e+02 2.071e+02 2.370e+02 3.584e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 07:00:21,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:26,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:00:26,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:28,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=632840.0, ans=0.125 2023-09-30 07:00:31,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 07:00:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:00:33,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:00:35,830 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=632840.0, ans=0.2 2023-09-30 07:00:40,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:40,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:00:42,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:00:46,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 07:00:48,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:00:53,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:54,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:00:58,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:58,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 07:00:59,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:59,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 07:00:59,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:59,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:02,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:02,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:02,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:03,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 07:01:03,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 07:01:03,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 07:01:03,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:03,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:05,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:05,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:05,921 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=632973.3333333334, ans=0.125 2023-09-30 07:01:12,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=632973.3333333334, ans=0.0 2023-09-30 07:01:14,432 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=633040.0, ans=0.125 2023-09-30 07:01:15,445 INFO [train.py:1039] (1/4) Epoch 18, batch 4650, loss[loss=0.1779, simple_loss=0.2477, pruned_loss=0.05406, over 23464.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2547, pruned_loss=0.05237, over 4718557.04 frames. ], batch size: 134, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:01:15,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=633040.0, ans=0.0 2023-09-30 07:01:17,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:01:19,119 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=633040.0, ans=0.0 2023-09-30 07:01:20,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:20,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:01:20,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:20,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:22,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:22,291 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=633040.0, ans=0.0 2023-09-30 07:01:25,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 07:01:25,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=633040.0, ans=0.125 2023-09-30 07:01:30,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:01:32,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 07:01:33,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:35,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 07:01:35,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:01:35,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 07:01:37,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 07:01:37,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:37,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:01:39,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=633106.6666666666, ans=0.125 2023-09-30 07:01:40,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:01:42,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:42,512 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 07:01:46,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:47,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 07:01:50,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:50,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:01:52,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 07:01:52,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:55,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:01:59,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:03,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:06,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:02:09,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 07:02:09,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 07:02:11,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 07:02:11,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 07:02:14,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:21,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:02:21,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:21,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 07:02:21,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:22,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:22,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:02:24,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:02:25,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:02:25,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:26,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:30,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:30,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:02:30,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:02:31,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=633306.6666666666, ans=0.0 2023-09-30 07:02:32,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:02:32,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:02:34,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 07:02:39,271 INFO [train.py:1039] (1/4) Epoch 18, batch 4700, loss[loss=0.1563, simple_loss=0.2346, pruned_loss=0.03894, over 24306.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2551, pruned_loss=0.05271, over 4720643.34 frames. ], batch size: 61, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:02:43,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:45,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:47,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:02:49,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:49,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:02:50,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=633373.3333333334, ans=0.125 2023-09-30 07:02:54,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 07:02:54,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 07:02:58,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:58,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:03:00,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:03:05,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:06,748 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.845e+02 2.033e+02 2.292e+02 3.478e+02, threshold=4.067e+02, percent-clipped=0.0 2023-09-30 07:03:13,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:03:15,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 07:03:18,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:03:24,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 07:03:24,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:03:26,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:30,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=633573.3333333334, ans=0.125 2023-09-30 07:03:31,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 07:03:33,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:03:33,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=633573.3333333334, ans=0.1 2023-09-30 07:03:39,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:03:39,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 07:03:41,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:41,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:43,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:44,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:03:44,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 07:03:46,437 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 07:03:48,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:48,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 07:03:50,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:54,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 07:03:57,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:03:59,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:03:59,807 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=633706.6666666666, ans=0.1 2023-09-30 07:04:01,454 INFO [train.py:1039] (1/4) Epoch 18, batch 4750, loss[loss=0.1754, simple_loss=0.2456, pruned_loss=0.05264, over 23577.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2559, pruned_loss=0.05295, over 4726771.65 frames. ], batch size: 149, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:04:03,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:03,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:04:05,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 07:04:05,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:09,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 07:04:10,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:04:11,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:04:12,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:21,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 07:04:24,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:04:27,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 07:04:27,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:28,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=633773.3333333334, ans=0.125 2023-09-30 07:04:30,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:33,055 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 07:04:33,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 07:04:33,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.55 vs. limit=15.0 2023-09-30 07:04:39,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 07:04:41,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:44,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:04:46,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:04:46,066 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 07:04:46,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:04:49,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:04:50,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:04:51,769 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.79 vs. limit=22.5 2023-09-30 07:04:54,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 07:04:54,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 07:04:54,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:55,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:04:56,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:58,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:04:58,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 07:04:59,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 07:05:04,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:09,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:05:09,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 07:05:10,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:12,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:14,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:05:14,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:16,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:05:16,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=633973.3333333334, ans=0.1 2023-09-30 07:05:20,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:20,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 07:05:22,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 07:05:23,717 INFO [train.py:1039] (1/4) Epoch 18, batch 4800, loss[loss=0.1871, simple_loss=0.2678, pruned_loss=0.05322, over 23929.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.256, pruned_loss=0.05272, over 4718467.42 frames. ], batch size: 80, lr: 5.67e-03, grad_scale: 32.0 2023-09-30 07:05:23,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 07:05:25,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:05:25,529 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:27,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 07:05:33,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:33,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:05:40,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.60 vs. limit=6.0 2023-09-30 07:05:40,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:40,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:42,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 07:05:42,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:42,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:05:46,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:05:46,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=634106.6666666666, ans=15.0 2023-09-30 07:05:50,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:05:51,854 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.888e+02 2.165e+02 2.522e+02 3.456e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 07:05:54,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:54,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:05:55,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:55,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 07:05:55,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:55,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:59,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:02,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:06:05,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:06:07,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:09,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 07:06:10,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 07:06:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:12,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:06:12,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:06:12,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:12,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:06:15,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:06:15,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:16,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=634240.0, ans=0.5 2023-09-30 07:06:20,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:21,750 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.23 vs. limit=15.0 2023-09-30 07:06:22,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:24,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:29,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 07:06:29,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:29,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:30,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:06:30,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:34,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=634306.6666666666, ans=0.0 2023-09-30 07:06:35,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:36,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:06:36,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:36,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:06:38,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:06:38,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:06:43,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:43,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:43,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:46,860 INFO [train.py:1039] (1/4) Epoch 18, batch 4850, loss[loss=0.2119, simple_loss=0.2774, pruned_loss=0.07323, over 19691.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2554, pruned_loss=0.05261, over 4723692.11 frames. ], batch size: 389, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:06:47,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 07:06:48,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 07:06:48,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:48,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:50,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:06:50,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:52,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:07:01,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 07:07:01,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:07,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:08,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:07:08,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:11,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:11,996 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:07:13,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:07:16,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:07:16,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 07:07:19,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:07:21,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:07:21,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:07:23,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:07:23,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 07:07:23,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=634506.6666666666, ans=0.125 2023-09-30 07:07:25,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:25,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 07:07:32,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 07:07:33,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:07:40,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:07:41,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 07:07:41,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:07:43,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:07:43,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:07:43,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=634573.3333333334, ans=0.125 2023-09-30 07:07:44,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 07:07:44,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:48,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 07:07:48,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:49,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:07:51,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 07:08:01,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:05,954 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-30 07:08:08,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:08:08,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:09,747 INFO [train.py:1039] (1/4) Epoch 18, batch 4900, loss[loss=0.1735, simple_loss=0.2393, pruned_loss=0.05386, over 23670.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2535, pruned_loss=0.05275, over 4707216.77 frames. ], batch size: 232, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:08:13,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 07:08:13,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:08:13,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=634706.6666666666, ans=0.0 2023-09-30 07:08:18,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:20,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:20,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=634706.6666666666, ans=0.125 2023-09-30 07:08:21,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:08:23,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 07:08:28,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 07:08:29,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=634773.3333333334, ans=0.1 2023-09-30 07:08:34,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 07:08:34,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 07:08:35,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:35,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:35,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:08:35,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:36,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:08:37,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 07:08:39,892 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.805e+02 1.986e+02 2.156e+02 3.448e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 07:08:40,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 07:08:41,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:08:43,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:08:44,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:47,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:08:49,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:51,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:51,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 07:08:52,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:08:55,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:55,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 07:08:55,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 07:08:57,371 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.31 vs. limit=10.0 2023-09-30 07:08:58,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 07:08:58,847 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=634906.6666666666, ans=15.0 2023-09-30 07:09:00,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:09:01,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:01,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:09:03,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:03,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:09:03,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:09:04,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 07:09:07,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:09,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:09:10,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:09:14,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 07:09:16,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:09:17,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:09:17,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 07:09:19,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=634973.3333333334, ans=0.125 2023-09-30 07:09:24,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:25,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:09:27,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 07:09:27,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:27,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:09:29,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:32,646 INFO [train.py:1039] (1/4) Epoch 18, batch 4950, loss[loss=0.179, simple_loss=0.2678, pruned_loss=0.04514, over 24608.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2525, pruned_loss=0.05238, over 4709693.71 frames. ], batch size: 68, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:09:32,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:32,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:09:32,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:32,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 07:09:33,032 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=635040.0, ans=0.125 2023-09-30 07:09:35,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:09:37,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:37,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:38,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.61 vs. limit=15.0 2023-09-30 07:09:41,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 07:09:41,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 07:09:41,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:09:42,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 07:09:42,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:42,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:44,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:09:44,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:09:48,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:48,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:09:49,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:09:51,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:52,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:52,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:56,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:09:59,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=635106.6666666666, ans=0.125 2023-09-30 07:10:02,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=635106.6666666666, ans=0.0 2023-09-30 07:10:03,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:05,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:10:07,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:08,106 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.88 vs. limit=15.0 2023-09-30 07:10:08,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:10,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:10:11,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 07:10:12,128 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=635173.3333333334, ans=0.125 2023-09-30 07:10:13,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 07:10:13,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=635173.3333333334, ans=0.125 2023-09-30 07:10:14,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:17,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:10:17,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:10:19,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:10:19,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:10:21,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:10:21,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:22,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:10:25,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:10:26,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:28,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:28,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 07:10:28,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:10:28,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=635240.0, ans=0.125 2023-09-30 07:10:30,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:10:31,400 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.94 vs. limit=15.0 2023-09-30 07:10:35,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:10:35,884 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.73 vs. limit=15.0 2023-09-30 07:10:36,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:10:36,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:10:38,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:38,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:10:38,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:10:40,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:10:42,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:10:42,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:43,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 07:10:44,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.82 vs. limit=15.0 2023-09-30 07:10:46,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:10:50,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=635306.6666666666, ans=0.0 2023-09-30 07:10:51,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 07:10:51,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:10:52,243 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.15 vs. limit=22.5 2023-09-30 07:10:55,418 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:10:56,579 INFO [train.py:1039] (1/4) Epoch 18, batch 5000, loss[loss=0.1629, simple_loss=0.2386, pruned_loss=0.04366, over 24321.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2513, pruned_loss=0.0522, over 4689098.02 frames. ], batch size: 61, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:10:57,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=635373.3333333334, ans=0.2 2023-09-30 07:10:58,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:58,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:01,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 07:11:01,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 07:11:03,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:05,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 07:11:07,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:11:07,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:11:08,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 07:11:08,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:10,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:11,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 07:11:11,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:11,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:13,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 07:11:15,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 07:11:16,142 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.08 vs. limit=6.0 2023-09-30 07:11:16,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:11:16,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 07:11:16,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:11:18,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:18,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:11:18,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 07:11:18,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 07:11:19,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 07:11:21,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:21,909 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=15.0 2023-09-30 07:11:22,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:22,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 07:11:22,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:24,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:26,590 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.857e+02 2.111e+02 2.507e+02 3.855e+02, threshold=4.222e+02, percent-clipped=0.0 2023-09-30 07:11:26,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:27,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=635440.0, ans=0.09899494936611666 2023-09-30 07:11:28,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:11:29,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 07:11:31,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:11:32,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:11:36,015 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 07:11:39,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:41,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:41,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:11:43,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 07:11:43,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=635506.6666666666, ans=0.1 2023-09-30 07:11:44,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:44,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:45,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:11:47,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 07:11:47,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:50,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:51,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:55,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=635573.3333333334, ans=0.09899494936611666 2023-09-30 07:11:56,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 07:11:58,878 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.28 vs. limit=15.0 2023-09-30 07:12:01,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:12,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:12:13,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:15,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:12:15,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:15,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:12:15,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:12:15,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=635640.0, ans=0.0 2023-09-30 07:12:16,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:20,226 INFO [train.py:1039] (1/4) Epoch 18, batch 5050, loss[loss=0.1876, simple_loss=0.2572, pruned_loss=0.059, over 23795.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2524, pruned_loss=0.0523, over 4697665.13 frames. ], batch size: 164, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:12:20,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=635706.6666666666, ans=0.0 2023-09-30 07:12:21,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:23,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 07:12:24,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:12:26,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:27,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:12:28,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 07:12:29,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:29,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:12:32,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:12:35,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:12:35,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:12:41,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=635773.3333333334, ans=0.1 2023-09-30 07:12:45,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 07:12:45,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:12:45,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=635773.3333333334, ans=0.125 2023-09-30 07:12:47,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:12:47,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 07:12:48,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:12:49,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:50,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:52,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:12:52,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 07:12:52,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 07:12:54,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:58,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:12:59,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:59,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 07:13:01,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:02,273 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.04 vs. limit=15.0 2023-09-30 07:13:04,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 07:13:05,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:13:05,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:13:07,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:08,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:13:12,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:15,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:13:16,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:16,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:13:16,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:13:16,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 07:13:17,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:13:18,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:13:23,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:23,172 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 07:13:23,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:13:26,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:28,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:28,182 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 07:13:29,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:29,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 07:13:29,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:35,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 07:13:37,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 07:13:40,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:40,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:13:40,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:13:41,721 INFO [train.py:1039] (1/4) Epoch 18, batch 5100, loss[loss=0.1672, simple_loss=0.2356, pruned_loss=0.04938, over 23455.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2534, pruned_loss=0.05242, over 4708001.47 frames. ], batch size: 119, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:13:43,477 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 07:13:44,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:50,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 07:13:50,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 07:13:51,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:53,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:54,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:55,499 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.03 vs. limit=10.0 2023-09-30 07:13:56,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 07:13:56,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 07:14:01,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:14:01,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:14:02,062 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.07 vs. limit=15.0 2023-09-30 07:14:06,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:14:08,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 07:14:10,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:10,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:14:10,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 07:14:11,821 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.826e+02 2.022e+02 2.261e+02 3.082e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 07:14:13,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=636173.3333333334, ans=0.2 2023-09-30 07:14:14,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 07:14:17,863 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 07:14:19,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:19,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 07:14:19,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 07:14:24,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:32,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:14:33,251 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.16 vs. limit=12.0 2023-09-30 07:14:36,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 07:14:36,121 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 07:14:36,134 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 07:14:37,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 07:14:37,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:39,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 07:14:44,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 07:14:47,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:14:49,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:14:52,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 07:14:53,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:14:53,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 07:14:59,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=636306.6666666666, ans=0.0 2023-09-30 07:15:00,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:15:00,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:15:00,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:15:02,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:15:02,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:15:02,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:15:03,686 INFO [train.py:1039] (1/4) Epoch 18, batch 5150, loss[loss=0.1817, simple_loss=0.2716, pruned_loss=0.04589, over 24440.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2546, pruned_loss=0.05296, over 4699124.71 frames. ], batch size: 69, lr: 5.66e-03, grad_scale: 8.0 2023-09-30 07:15:03,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 07:15:03,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 07:15:05,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 07:15:05,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:15:06,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 07:15:10,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:10,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:15:12,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:14,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:17,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:15:17,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 07:15:17,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=636373.3333333334, ans=0.0 2023-09-30 07:15:21,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:21,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:15:22,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:15:22,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:22,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:22,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:15:22,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:15:24,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 07:15:25,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:15:27,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:15:27,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=636440.0, ans=0.125 2023-09-30 07:15:30,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:15:32,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 07:15:34,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:15:34,400 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:15:39,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:15:40,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 07:15:47,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:54,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:56,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:16:00,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:00,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:03,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 07:16:06,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:16:08,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:16:08,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:16:12,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:13,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:15,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 07:16:17,131 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=636640.0, ans=0.2 2023-09-30 07:16:18,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:16:20,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:16:23,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:16:23,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:16:24,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:16:24,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:16:24,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:16:24,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:16:27,165 INFO [train.py:1039] (1/4) Epoch 18, batch 5200, loss[loss=0.1828, simple_loss=0.2481, pruned_loss=0.05876, over 23960.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2554, pruned_loss=0.05305, over 4709248.42 frames. ], batch size: 196, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:16:28,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:16:29,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=636706.6666666666, ans=0.125 2023-09-30 07:16:30,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:16:35,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:39,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 07:16:41,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:16:42,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:44,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:46,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:16:46,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:48,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 07:16:51,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:16:52,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:53,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=636773.3333333334, ans=0.125 2023-09-30 07:16:55,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 07:16:58,250 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.839e+02 2.029e+02 2.263e+02 2.861e+02, threshold=4.059e+02, percent-clipped=0.0 2023-09-30 07:16:58,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:16:59,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:17:00,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 07:17:01,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 07:17:03,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 07:17:03,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:03,183 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 07:17:03,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:17:04,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:06,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:17:06,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 07:17:07,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:09,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:13,232 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=636840.0, ans=0.0 2023-09-30 07:17:14,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 07:17:14,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 07:17:14,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 07:17:16,465 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=636906.6666666666, ans=0.125 2023-09-30 07:17:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 07:17:19,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:17:20,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.03 vs. limit=22.5 2023-09-30 07:17:25,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:17:27,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:28,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 07:17:28,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:30,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:17:30,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:30,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:17:34,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:36,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:17:39,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:39,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:39,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:44,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:45,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=636973.3333333334, ans=0.125 2023-09-30 07:17:46,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 07:17:47,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:47,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:17:49,440 INFO [train.py:1039] (1/4) Epoch 18, batch 5250, loss[loss=0.1779, simple_loss=0.2564, pruned_loss=0.04973, over 23128.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2549, pruned_loss=0.05308, over 4706081.98 frames. ], batch size: 105, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:17:49,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:49,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:17:51,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:17:53,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:57,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:58,159 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=637040.0, ans=0.1 2023-09-30 07:17:59,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:18:01,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:18:06,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:18:08,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:18:09,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:18:11,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:18:11,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=637106.6666666666, ans=0.0 2023-09-30 07:18:13,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 07:18:13,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:18:14,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:18:30,437 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=637173.3333333334, ans=0.0 2023-09-30 07:18:31,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.61 vs. limit=22.5 2023-09-30 07:18:41,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=637240.0, ans=0.0 2023-09-30 07:18:45,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=637240.0, ans=0.0 2023-09-30 07:18:47,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=637240.0, ans=0.025 2023-09-30 07:18:55,474 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.45 vs. limit=15.0 2023-09-30 07:19:04,840 INFO [train.py:1039] (1/4) Epoch 18, batch 5300, loss[loss=0.1748, simple_loss=0.2627, pruned_loss=0.04347, over 24336.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2538, pruned_loss=0.0525, over 4704846.73 frames. ], batch size: 74, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:19:11,079 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-09-30 07:19:13,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=637373.3333333334, ans=0.1 2023-09-30 07:19:20,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:19:20,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 07:19:20,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 07:19:20,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:21,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:21,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:21,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:21,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:21,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:19:21,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:21,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:19:22,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:19:22,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 07:19:22,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 07:19:22,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 07:19:22,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:19:22,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 07:19:22,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 07:19:23,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:23,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:24,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:24,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:24,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:19:24,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:24,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:24,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:25,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:25,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:25,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:19:25,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:25,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:19:26,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 07:19:26,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:26,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:26,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 07:19:26,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 07:19:26,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:19:26,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:19:26,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 07:19:27,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 07:19:27,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:28,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:19:28,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:28,889 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 07:19:29,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 07:19:29,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:19:29,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:29,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 07:19:29,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 07:19:29,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 07:19:29,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:38,650 INFO [train.py:1039] (1/4) Epoch 19, batch 0, loss[loss=0.1952, simple_loss=0.2614, pruned_loss=0.06453, over 23560.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2614, pruned_loss=0.06453, over 23560.00 frames. ], batch size: 256, lr: 5.50e-03, grad_scale: 32.0 2023-09-30 07:19:38,651 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 07:19:52,807 INFO [train.py:1071] (1/4) Epoch 19, validation: loss=0.3241, simple_loss=0.2677, pruned_loss=0.1902, over 1125622.00 frames. 2023-09-30 07:19:52,808 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 07:19:54,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=637460.0, ans=0.125 2023-09-30 07:19:55,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 07:19:55,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:19:58,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:20:01,918 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.881e+02 2.156e+02 2.381e+02 5.566e+02, threshold=4.312e+02, percent-clipped=3.0 2023-09-30 07:20:06,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:06,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:20:06,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:07,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 07:20:09,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 07:20:12,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:12,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:17,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:19,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:19,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:20:19,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:20,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 07:20:20,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=637526.6666666666, ans=0.125 2023-09-30 07:20:22,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:22,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=637593.3333333334, ans=0.0 2023-09-30 07:20:31,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:20:31,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:34,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.28 vs. limit=15.0 2023-09-30 07:20:34,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 07:20:39,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:20:39,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:20:41,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:45,720 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:20:48,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:49,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=637660.0, ans=0.125 2023-09-30 07:20:54,343 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=637660.0, ans=0.125 2023-09-30 07:20:55,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 07:20:58,070 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.39 vs. limit=15.0 2023-09-30 07:20:59,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 07:20:59,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:20:59,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:20:59,293 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:21:01,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:21:02,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:04,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 07:21:05,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:07,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:12,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:21:13,701 INFO [train.py:1039] (1/4) Epoch 19, batch 50, loss[loss=0.211, simple_loss=0.279, pruned_loss=0.0715, over 23336.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2561, pruned_loss=0.05249, over 1075483.93 frames. ], batch size: 93, lr: 5.50e-03, grad_scale: 16.0 2023-09-30 07:21:15,525 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 07:21:17,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:21:21,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:23,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:21:23,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 07:21:23,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:21:24,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:21:25,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=637793.3333333334, ans=0.125 2023-09-30 07:21:26,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:28,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:28,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=637860.0, ans=0.1 2023-09-30 07:21:28,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=637860.0, ans=0.0 2023-09-30 07:21:30,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:35,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 07:21:35,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:42,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:21:45,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 07:21:45,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=637926.6666666666, ans=0.0 2023-09-30 07:21:47,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 07:21:48,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:21:48,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:21:48,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:50,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:51,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:21:51,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:21:51,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:22:01,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:01,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:01,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:22:03,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 07:22:04,960 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:22:06,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:22:06,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 07:22:07,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:22:08,369 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:22:10,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 07:22:16,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:22:16,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:16,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:19,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:19,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:22,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 07:22:22,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 07:22:23,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:25,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:26,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:22:28,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:22:29,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 07:22:29,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 07:22:30,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:22:32,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:33,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:22:34,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 07:22:34,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 07:22:36,021 INFO [train.py:1039] (1/4) Epoch 19, batch 100, loss[loss=0.1989, simple_loss=0.2821, pruned_loss=0.05781, over 24568.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2561, pruned_loss=0.05285, over 1888406.05 frames. ], batch size: 71, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:22:36,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:36,933 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=12.0 2023-09-30 07:22:37,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:39,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:22:39,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:22:41,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=638126.6666666666, ans=0.1 2023-09-30 07:22:42,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:22:45,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-09-30 07:22:45,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:22:47,052 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.406e+02 1.850e+02 1.971e+02 2.245e+02 4.662e+02, threshold=3.942e+02, percent-clipped=2.0 2023-09-30 07:22:50,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:22:50,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 07:22:51,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:54,196 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=638193.3333333334, ans=0.2 2023-09-30 07:22:56,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:22:56,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:57,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=638193.3333333334, ans=0.125 2023-09-30 07:22:58,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:58,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:58,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 07:23:02,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:23:02,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:02,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:02,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:23:06,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 07:23:06,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:08,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:09,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:23:10,746 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.23 vs. limit=22.5 2023-09-30 07:23:12,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:23:13,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=638260.0, ans=0.125 2023-09-30 07:23:15,116 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 07:23:15,143 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 07:23:15,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=638260.0, ans=0.0 2023-09-30 07:23:16,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:16,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:23:21,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:23:23,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:26,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:30,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:31,703 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 07:23:33,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:23:35,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:23:36,121 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=638326.6666666666, ans=0.0 2023-09-30 07:23:37,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:23:40,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:44,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:47,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:23:47,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:23:50,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:51,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:53,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:53,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:23:55,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:55,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 07:23:55,288 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 07:23:55,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:56,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:23:56,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:56,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:58,204 INFO [train.py:1039] (1/4) Epoch 19, batch 150, loss[loss=0.2077, simple_loss=0.2828, pruned_loss=0.0663, over 23945.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2574, pruned_loss=0.05304, over 2505628.85 frames. ], batch size: 86, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:23:58,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 07:23:58,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:23:58,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:23:58,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:01,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:01,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:24:03,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:24:06,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:06,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=638460.0, ans=0.125 2023-09-30 07:24:11,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:24:11,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:12,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:14,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:14,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.45 vs. limit=15.0 2023-09-30 07:24:15,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:24:17,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:21,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=638526.6666666666, ans=0.125 2023-09-30 07:24:22,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 07:24:22,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 07:24:22,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 07:24:25,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:24:25,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:24:27,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:24:29,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:24:29,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:29,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:29,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:32,376 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 07:24:33,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:39,575 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=638593.3333333334, ans=0.0 2023-09-30 07:24:40,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:45,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:24:45,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 07:24:50,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:24:50,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:50,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:24:52,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:24:53,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.53 vs. limit=22.5 2023-09-30 07:24:54,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:54,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:24:57,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:57,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 07:24:58,193 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=638660.0, ans=0.0 2023-09-30 07:25:01,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:03,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:03,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:25:03,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:25:06,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:06,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=638726.6666666666, ans=0.0 2023-09-30 07:25:09,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 07:25:11,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:25:13,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:25:14,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:16,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:25:16,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 07:25:16,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:25:16,215 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 07:25:16,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=638726.6666666666, ans=0.125 2023-09-30 07:25:18,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=638726.6666666666, ans=0.125 2023-09-30 07:25:19,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:21,410 INFO [train.py:1039] (1/4) Epoch 19, batch 200, loss[loss=0.1655, simple_loss=0.2483, pruned_loss=0.04132, over 24306.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2584, pruned_loss=0.05317, over 3004085.59 frames. ], batch size: 61, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:25:24,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:25:24,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:25:28,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 07:25:28,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:28,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=638793.3333333334, ans=0.125 2023-09-30 07:25:29,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:32,715 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.866e+02 2.060e+02 2.341e+02 3.608e+02, threshold=4.119e+02, percent-clipped=0.0 2023-09-30 07:25:33,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 07:25:34,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:25:36,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:38,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:40,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:25:40,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:40,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:59,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:26:00,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:26:00,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:26:00,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:26:02,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:26:02,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:26:03,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:03,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:26:05,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:05,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:08,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 07:26:08,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:26:08,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:08,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=638926.6666666666, ans=0.125 2023-09-30 07:26:09,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=638993.3333333334, ans=0.125 2023-09-30 07:26:13,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:26:17,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=638993.3333333334, ans=0.025 2023-09-30 07:26:18,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:23,804 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=638993.3333333334, ans=0.125 2023-09-30 07:26:25,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:26,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:26:33,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:35,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 07:26:36,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:36,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:26:36,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:39,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:26:41,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 07:26:42,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:26:42,602 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 07:26:43,981 INFO [train.py:1039] (1/4) Epoch 19, batch 250, loss[loss=0.18, simple_loss=0.2409, pruned_loss=0.0595, over 23748.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2573, pruned_loss=0.0535, over 3380579.57 frames. ], batch size: 179, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:26:45,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:47,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:26:50,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:50,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:54,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:26:54,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:56,611 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.33 vs. limit=15.0 2023-09-30 07:26:57,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:26:57,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=639126.6666666666, ans=0.125 2023-09-30 07:27:00,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:02,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=639193.3333333334, ans=0.0 2023-09-30 07:27:12,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:12,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=639193.3333333334, ans=0.0 2023-09-30 07:27:13,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:27:15,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:27:18,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:27:20,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:27:21,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:27:21,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:23,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:27:25,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:27:27,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:30,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:27:33,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 07:27:33,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:35,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:27:35,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:27:35,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:27:36,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:27:37,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:27:37,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:27:40,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:41,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:27:43,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:46,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:27:46,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=639326.6666666666, ans=0.035 2023-09-30 07:27:51,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:51,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=639393.3333333334, ans=0.1 2023-09-30 07:27:54,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:56,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=639393.3333333334, ans=0.1 2023-09-30 07:27:57,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:59,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:28:03,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 07:28:03,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=639393.3333333334, ans=0.125 2023-09-30 07:28:05,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:05,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:28:05,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 07:28:07,409 INFO [train.py:1039] (1/4) Epoch 19, batch 300, loss[loss=0.1681, simple_loss=0.2562, pruned_loss=0.03999, over 24436.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2545, pruned_loss=0.05274, over 3676356.80 frames. ], batch size: 69, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:28:07,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:28:09,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:28:09,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 07:28:09,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=639460.0, ans=0.1 2023-09-30 07:28:11,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=639460.0, ans=0.125 2023-09-30 07:28:12,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:28:12,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=639460.0, ans=0.1 2023-09-30 07:28:13,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:28:16,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:28:18,699 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.819e+02 2.024e+02 2.204e+02 2.893e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-30 07:28:18,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 07:28:20,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:28:21,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:28:21,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 07:28:21,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:26,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:28:32,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:28:33,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 07:28:37,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 07:28:37,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:41,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:42,208 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.64 vs. limit=15.0 2023-09-30 07:28:43,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:43,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 07:28:43,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:28:44,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:28:46,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:28:47,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:52,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:28:52,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 07:28:52,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:28:56,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:56,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 07:28:57,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:00,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.29 vs. limit=15.0 2023-09-30 07:29:02,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:29:06,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:29:06,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 07:29:10,143 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.76 vs. limit=10.0 2023-09-30 07:29:11,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:11,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:29:14,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:17,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:29:17,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 07:29:17,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:29:19,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:19,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 07:29:22,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:22,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:22,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=639726.6666666666, ans=0.125 2023-09-30 07:29:23,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:23,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:25,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:28,625 INFO [train.py:1039] (1/4) Epoch 19, batch 350, loss[loss=0.1599, simple_loss=0.2447, pruned_loss=0.03753, over 24337.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2524, pruned_loss=0.0523, over 3892000.96 frames. ], batch size: 61, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:29:30,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:30,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:29:33,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:35,729 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.40 vs. limit=15.0 2023-09-30 07:29:40,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:41,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:43,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:46,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 07:29:47,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:48,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 07:29:50,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:52,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 07:29:53,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:55,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 07:29:55,838 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.65 vs. limit=15.0 2023-09-30 07:29:58,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:29:59,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:30:01,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:30:01,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:01,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:03,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:03,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:03,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:30:06,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:06,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:12,570 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=12.0 2023-09-30 07:30:14,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:30:14,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:30:15,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:30:17,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:24,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=639993.3333333334, ans=0.1 2023-09-30 07:30:26,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 07:30:26,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:30,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:30,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:30,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:30:33,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 07:30:35,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:36,587 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 07:30:36,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 07:30:36,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:39,063 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.18 vs. limit=15.0 2023-09-30 07:30:40,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:40,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 07:30:41,077 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.88 vs. limit=22.5 2023-09-30 07:30:43,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:48,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:30:49,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:50,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:50,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:52,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:55,829 INFO [train.py:1039] (1/4) Epoch 19, batch 400, loss[loss=0.1775, simple_loss=0.2615, pruned_loss=0.04671, over 23741.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2522, pruned_loss=0.05216, over 4060710.47 frames. ], batch size: 85, lr: 5.49e-03, grad_scale: 32.0 2023-09-30 07:30:56,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:59,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:30:59,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 07:30:59,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:00,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:01,099 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=640126.6666666666, ans=0.125 2023-09-30 07:31:03,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:31:03,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:06,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:06,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:07,644 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.862e+02 2.041e+02 2.218e+02 3.370e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-30 07:31:07,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 07:31:10,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 07:31:10,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:13,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 07:31:13,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:15,867 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=640193.3333333334, ans=0.0 2023-09-30 07:31:17,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:31:17,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:17,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 07:31:19,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:31:19,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:19,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:19,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=640193.3333333334, ans=0.0 2023-09-30 07:31:20,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:22,368 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 07:31:25,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 07:31:30,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:31,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:32,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 07:31:32,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 07:31:35,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:31:37,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:31:40,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.63 vs. limit=15.0 2023-09-30 07:31:46,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 07:31:47,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:31:49,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 07:31:51,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:53,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:31:54,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 07:31:57,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=640326.6666666666, ans=0.0 2023-09-30 07:31:59,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:32:03,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:32:05,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:32:06,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:08,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 07:32:08,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=640393.3333333334, ans=0.0 2023-09-30 07:32:11,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:32:11,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 07:32:12,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:32:12,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:32:15,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 07:32:19,081 INFO [train.py:1039] (1/4) Epoch 19, batch 450, loss[loss=0.1989, simple_loss=0.2572, pruned_loss=0.0703, over 23788.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2527, pruned_loss=0.05173, over 4216532.45 frames. ], batch size: 179, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:32:19,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:32:20,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:32:20,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:32:22,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 07:32:22,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:32:23,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:32:25,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:32:25,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 07:32:25,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:32:25,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=640460.0, ans=0.125 2023-09-30 07:32:26,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:32:30,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:32:30,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=640460.0, ans=0.2 2023-09-30 07:32:39,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:39,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:32:40,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 07:32:42,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 07:32:46,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:32:49,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:51,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:32:55,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:55,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:57,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=640593.3333333334, ans=0.1 2023-09-30 07:32:57,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=640593.3333333334, ans=0.2 2023-09-30 07:32:58,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 07:32:58,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 07:32:58,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=640593.3333333334, ans=0.0 2023-09-30 07:33:01,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 07:33:03,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:04,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:06,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:33:06,574 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 07:33:06,588 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 07:33:08,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:33:10,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:33:11,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:33:12,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=640660.0, ans=0.125 2023-09-30 07:33:13,765 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=640660.0, ans=0.1 2023-09-30 07:33:14,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:33:14,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:33:14,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:33:16,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 07:33:17,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:20,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:33:20,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:33:22,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 07:33:25,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:33:27,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 07:33:29,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 07:33:29,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:34,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:33:36,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:33:37,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:33:39,161 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 07:33:40,601 INFO [train.py:1039] (1/4) Epoch 19, batch 500, loss[loss=0.2033, simple_loss=0.2668, pruned_loss=0.06995, over 23555.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2533, pruned_loss=0.0516, over 4335060.24 frames. ], batch size: 256, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:33:44,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:45,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:33:46,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:46,744 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 07:33:49,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 07:33:49,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:52,508 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.803e+02 2.032e+02 2.368e+02 3.527e+02, threshold=4.065e+02, percent-clipped=0.0 2023-09-30 07:33:52,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:33:57,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:33:58,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:34:00,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:34:00,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:34:00,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:05,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=640860.0, ans=0.5 2023-09-30 07:34:12,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:12,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:34:14,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:34:14,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:14,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 07:34:15,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:34:17,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:34:19,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:34:20,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:34:20,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:22,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 07:34:25,900 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 07:34:30,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:30,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:32,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:34:35,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 07:34:38,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:34:39,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:42,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=640993.3333333334, ans=0.04949747468305833 2023-09-30 07:34:44,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:34:46,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:53,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:57,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 07:34:57,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:57,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:00,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 07:35:00,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:35:02,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=641126.6666666666, ans=0.1 2023-09-30 07:35:03,217 INFO [train.py:1039] (1/4) Epoch 19, batch 550, loss[loss=0.1885, simple_loss=0.2553, pruned_loss=0.06084, over 21232.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2552, pruned_loss=0.05295, over 4399943.18 frames. ], batch size: 46, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:35:03,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:07,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 07:35:09,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 07:35:09,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:09,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 07:35:11,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:35:11,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:11,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:35:14,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:35:15,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=641126.6666666666, ans=0.125 2023-09-30 07:35:17,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:18,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 07:35:18,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:35:20,102 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=641193.3333333334, ans=0.125 2023-09-30 07:35:24,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:24,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:26,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:28,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:33,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 07:35:33,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 07:35:36,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:35:42,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:35:42,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:42,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=641260.0, ans=0.0 2023-09-30 07:35:43,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:35:47,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:47,174 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 07:35:48,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:50,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:35:52,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:53,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:35:53,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:35:55,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:56,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 07:35:57,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 07:35:59,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:59,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:59,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:35:59,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:36:05,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:36:06,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:36:06,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=641326.6666666666, ans=0.1 2023-09-30 07:36:08,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:36:09,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:09,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:36:12,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:36:12,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:36:14,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:36:14,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:15,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:36:15,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:36:16,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=641393.3333333334, ans=0.125 2023-09-30 07:36:16,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=641393.3333333334, ans=0.025 2023-09-30 07:36:16,663 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.26 vs. limit=10.0 2023-09-30 07:36:22,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 07:36:24,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 07:36:25,845 INFO [train.py:1039] (1/4) Epoch 19, batch 600, loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.03721, over 24567.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2551, pruned_loss=0.05307, over 4478143.59 frames. ], batch size: 60, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:36:26,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:36:27,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:36:27,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:36:36,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:36:37,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:36:39,266 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.814e+02 2.073e+02 2.344e+02 3.797e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 07:36:39,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 07:36:41,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:36:42,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:36:45,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:47,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 07:36:47,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:36:55,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 07:36:58,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:36:58,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:58,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:37:04,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:37:04,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:37:06,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:11,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:37:15,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.04 vs. limit=22.5 2023-09-30 07:37:17,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:17,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:37:17,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:37:25,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 07:37:27,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=641660.0, ans=0.1 2023-09-30 07:37:31,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:37:31,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:37:36,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 07:37:36,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:37:40,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 07:37:40,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:37:40,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:37:46,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:37:48,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:37:49,748 INFO [train.py:1039] (1/4) Epoch 19, batch 650, loss[loss=0.1798, simple_loss=0.2593, pruned_loss=0.05011, over 24665.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.254, pruned_loss=0.05293, over 4518034.96 frames. ], batch size: 68, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:37:49,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:37:51,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:37:53,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:37:56,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 07:37:57,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:38:03,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:38:03,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:14,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 07:38:16,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:16,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:20,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:20,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:38:23,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:23,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:24,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:38:24,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:26,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:38:27,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:38:27,976 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 07:38:27,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:29,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:32,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:32,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:34,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:34,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:38:35,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 07:38:37,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:38:37,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:38:37,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:38:37,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:39,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:38:41,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 07:38:43,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 07:38:43,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:43,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:43,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=641993.3333333334, ans=0.125 2023-09-30 07:38:43,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=641993.3333333334, ans=0.125 2023-09-30 07:38:44,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:38:44,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:46,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:51,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:53,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:54,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:59,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:59,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:39:00,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:39:01,458 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.51 vs. limit=6.0 2023-09-30 07:39:01,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.66 vs. limit=22.5 2023-09-30 07:39:02,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=642060.0, ans=0.025 2023-09-30 07:39:08,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:39:08,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:08,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:08,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:11,459 INFO [train.py:1039] (1/4) Epoch 19, batch 700, loss[loss=0.1788, simple_loss=0.2504, pruned_loss=0.05361, over 23406.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2528, pruned_loss=0.0525, over 4545206.84 frames. ], batch size: 105, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:39:14,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 07:39:16,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 07:39:19,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 07:39:20,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:22,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:39:22,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 07:39:25,588 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.802e+02 1.961e+02 2.175e+02 2.904e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 07:39:27,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:29,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:39:29,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=642193.3333333334, ans=0.125 2023-09-30 07:39:30,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:32,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:39:33,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:39:36,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:39,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:39:39,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:39:40,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=642193.3333333334, ans=0.2 2023-09-30 07:39:41,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 07:39:44,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 07:39:47,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:39:47,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:39:50,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:39:52,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=642260.0, ans=0.125 2023-09-30 07:39:54,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=642260.0, ans=0.125 2023-09-30 07:39:56,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:39:57,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 07:40:03,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:03,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:40:04,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 07:40:09,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:40:09,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=642326.6666666666, ans=0.125 2023-09-30 07:40:10,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:13,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:40:17,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:40:18,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 07:40:19,752 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=15.0 2023-09-30 07:40:21,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 07:40:23,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 07:40:27,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:29,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:29,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:40:33,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:33,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 07:40:34,993 INFO [train.py:1039] (1/4) Epoch 19, batch 750, loss[loss=0.1778, simple_loss=0.2609, pruned_loss=0.04738, over 23383.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2525, pruned_loss=0.0521, over 4577340.99 frames. ], batch size: 93, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:40:38,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 07:40:38,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 07:40:38,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=642460.0, ans=0.125 2023-09-30 07:40:39,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 07:40:41,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 07:40:41,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 07:40:41,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:40:42,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 07:40:44,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:44,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:40:45,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:47,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:48,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:40:48,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:50,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:40:50,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:40:53,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:40:54,148 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=642526.6666666666, ans=0.125 2023-09-30 07:40:55,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:55,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:56,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 07:40:58,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:40:58,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:00,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:03,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:41:04,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 07:41:04,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:07,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 07:41:07,717 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 07:41:09,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 07:41:09,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:41:09,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:41:11,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:41:11,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=642593.3333333334, ans=0.1 2023-09-30 07:41:19,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:41:19,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:19,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:41:20,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:41:23,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:41:23,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 07:41:23,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:41:25,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:41:27,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:41:27,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=642660.0, ans=0.04949747468305833 2023-09-30 07:41:30,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:41:30,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 07:41:31,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:37,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:41:39,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:41:41,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:41:43,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:41:47,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 07:41:47,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:41:49,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:55,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:56,731 INFO [train.py:1039] (1/4) Epoch 19, batch 800, loss[loss=0.1806, simple_loss=0.2663, pruned_loss=0.04747, over 24461.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2536, pruned_loss=0.05236, over 4608897.25 frames. ], batch size: 69, lr: 5.47e-03, grad_scale: 32.0 2023-09-30 07:41:56,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:42:03,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:03,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:05,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:42:05,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:06,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:07,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:09,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:10,448 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.847e+02 2.108e+02 2.482e+02 4.355e+02, threshold=4.217e+02, percent-clipped=1.0 2023-09-30 07:42:14,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:14,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:42:15,155 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-09-30 07:42:19,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 07:42:19,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:20,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:20,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:42:22,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:22,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 07:42:22,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:23,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 07:42:27,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:29,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=642926.6666666666, ans=0.125 2023-09-30 07:42:30,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:33,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:42:33,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:34,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:34,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:41,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:42:41,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:42:42,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:42:43,003 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 07:42:43,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 07:42:44,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:42:44,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:46,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:46,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:42:52,271 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 07:42:52,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 07:42:55,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:42:56,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:42:58,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=642993.3333333334, ans=0.1 2023-09-30 07:43:01,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:43:04,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:04,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 07:43:05,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:43:07,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 07:43:14,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:17,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:43:19,039 INFO [train.py:1039] (1/4) Epoch 19, batch 850, loss[loss=0.1917, simple_loss=0.2577, pruned_loss=0.0628, over 23817.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2552, pruned_loss=0.05283, over 4628725.81 frames. ], batch size: 195, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:43:19,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 07:43:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:43:19,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:21,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 07:43:21,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:24,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:43:26,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:26,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:43:28,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:43:29,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 07:43:29,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 07:43:29,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 07:43:32,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:32,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:43:34,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:34,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:43:39,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:40,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:40,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 07:43:43,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 07:43:47,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:48,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 07:43:53,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 07:43:55,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 07:43:57,811 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 07:43:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:43:57,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:43:59,252 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:44:01,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 07:44:05,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:44:05,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:06,734 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=15.0 2023-09-30 07:44:07,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:44:08,264 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.06 vs. limit=22.5 2023-09-30 07:44:08,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:44:10,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:44:11,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:44:13,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 07:44:17,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:44:17,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:19,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:44:19,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:19,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:21,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=643326.6666666666, ans=0.0 2023-09-30 07:44:24,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:24,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:44:26,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:44:27,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:28,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:44:32,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=643393.3333333334, ans=0.2 2023-09-30 07:44:36,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:44:38,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:38,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 07:44:38,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:39,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:41,363 INFO [train.py:1039] (1/4) Epoch 19, batch 900, loss[loss=0.1912, simple_loss=0.2655, pruned_loss=0.0584, over 23704.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2559, pruned_loss=0.05304, over 4653459.56 frames. ], batch size: 85, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:44:42,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 07:44:49,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:44:50,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:52,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 07:44:52,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=643460.0, ans=0.125 2023-09-30 07:44:53,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:44:55,223 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.916e+02 2.182e+02 2.478e+02 5.058e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 07:44:55,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 07:44:55,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:44:55,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=643526.6666666666, ans=0.125 2023-09-30 07:44:57,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:57,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:44:59,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:44:59,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:44:59,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=643526.6666666666, ans=0.0 2023-09-30 07:45:10,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:10,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:45:11,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:45:11,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=643526.6666666666, ans=0.125 2023-09-30 07:45:13,489 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.93 vs. limit=15.0 2023-09-30 07:45:14,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:16,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=643593.3333333334, ans=0.07 2023-09-30 07:45:17,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=643593.3333333334, ans=0.125 2023-09-30 07:45:18,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 07:45:18,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=643593.3333333334, ans=0.1 2023-09-30 07:45:20,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:45:25,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:45:26,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:45:26,610 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 07:45:26,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 07:45:33,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:45:33,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:45:33,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:45:36,195 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.60 vs. limit=15.0 2023-09-30 07:45:41,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:41,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:45:44,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 07:45:44,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:48,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 07:45:50,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:45:50,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:51,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:45:52,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:45:56,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 07:45:56,775 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 07:45:58,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:45:59,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 07:46:01,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:02,753 INFO [train.py:1039] (1/4) Epoch 19, batch 950, loss[loss=0.1797, simple_loss=0.2402, pruned_loss=0.05961, over 22723.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.256, pruned_loss=0.05286, over 4663415.85 frames. ], batch size: 322, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:46:04,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 07:46:06,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=643793.3333333334, ans=0.1 2023-09-30 07:46:10,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:13,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:13,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:15,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:46:15,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=643793.3333333334, ans=0.2 2023-09-30 07:46:17,591 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 07:46:22,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:23,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:23,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:25,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:46:25,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 07:46:25,494 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=643860.0, ans=0.125 2023-09-30 07:46:25,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=643860.0, ans=0.125 2023-09-30 07:46:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:46:28,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:28,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 07:46:30,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:33,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:33,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:34,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:34,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 07:46:37,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:46:39,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:43,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:46:50,187 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:46:50,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:53,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 07:46:57,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:46:57,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:46:58,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:46:58,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:58,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:46:59,086 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=643993.3333333334, ans=0.0 2023-09-30 07:47:02,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 07:47:03,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:47:05,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:06,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:06,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 07:47:06,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:06,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:47:08,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 07:47:08,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=644060.0, ans=0.125 2023-09-30 07:47:12,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:47:16,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:21,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:23,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 07:47:23,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 07:47:26,771 INFO [train.py:1039] (1/4) Epoch 19, batch 1000, loss[loss=0.1656, simple_loss=0.2465, pruned_loss=0.04238, over 21483.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.255, pruned_loss=0.05282, over 4676074.74 frames. ], batch size: 47, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:47:26,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:30,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 07:47:32,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:47:34,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=644126.6666666666, ans=0.125 2023-09-30 07:47:36,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:47:38,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 07:47:38,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 07:47:38,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=644126.6666666666, ans=0.125 2023-09-30 07:47:42,700 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 2.138e+02 2.514e+02 3.202e+02 5.752e+02, threshold=5.028e+02, percent-clipped=6.0 2023-09-30 07:47:43,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:47:44,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:45,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:47,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 07:47:48,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=644193.3333333334, ans=0.0 2023-09-30 07:47:51,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 07:47:53,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 07:47:53,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:47:56,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 07:47:56,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 07:47:56,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 07:47:58,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:00,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:09,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:09,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:48:11,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:11,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:11,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 07:48:11,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:13,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:48:14,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:14,632 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 07:48:18,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 07:48:19,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 07:48:20,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 07:48:22,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:48:29,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:29,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:48:29,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:32,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:48:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 07:48:36,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:48:36,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 07:48:36,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=644393.3333333334, ans=0.1 2023-09-30 07:48:37,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 07:48:40,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:48:40,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:43,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:48:46,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:48:48,063 INFO [train.py:1039] (1/4) Epoch 19, batch 1050, loss[loss=0.1871, simple_loss=0.2763, pruned_loss=0.04892, over 24649.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2535, pruned_loss=0.05226, over 4688288.74 frames. ], batch size: 73, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:48:48,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:48,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=644460.0, ans=0.0 2023-09-30 07:48:50,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:48:51,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:48:53,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:48:54,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:58,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:00,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:49:01,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:49:05,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:49:06,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:49:06,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:49:08,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:49:08,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 07:49:09,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:09,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 07:49:15,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:49:15,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 07:49:15,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:49:21,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:49:21,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:49:21,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:24,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 07:49:24,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 07:49:25,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.70 vs. limit=22.5 2023-09-30 07:49:26,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:27,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 07:49:31,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 07:49:33,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:49:36,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=644660.0, ans=0.0 2023-09-30 07:49:38,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:49:40,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:49:40,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:49:41,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:49:45,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:49:48,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 07:49:50,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 07:49:50,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 07:49:51,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:51,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:49:53,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 07:49:56,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:49:58,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:58,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:49:58,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:49:59,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:03,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=644726.6666666666, ans=0.1 2023-09-30 07:50:04,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:06,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 07:50:07,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:50:07,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 07:50:08,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 07:50:08,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:50:11,432 INFO [train.py:1039] (1/4) Epoch 19, batch 1100, loss[loss=0.1758, simple_loss=0.259, pruned_loss=0.04628, over 24647.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2532, pruned_loss=0.05228, over 4677746.15 frames. ], batch size: 73, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:50:11,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:50:17,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:50:20,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=644793.3333333334, ans=0.125 2023-09-30 07:50:23,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:50:25,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:50:25,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:26,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 07:50:26,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:50:28,203 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.773e+02 2.054e+02 2.605e+02 4.840e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 07:50:29,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:50:31,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:50:34,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:50:34,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 07:50:36,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:50:39,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:39,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:50:41,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:50:43,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=644926.6666666666, ans=0.125 2023-09-30 07:50:43,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.63 vs. limit=15.0 2023-09-30 07:50:44,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:50:49,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:50:51,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 07:50:53,207 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 07:50:53,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:56,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:58,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:50:59,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:51:00,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 07:51:01,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:51:01,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:51:01,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:51:01,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 07:51:08,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:51:08,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 07:51:11,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:51:18,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:51:21,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 07:51:21,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:51:22,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:26,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:26,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:28,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 07:51:28,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:51:28,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:30,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 07:51:30,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:51:30,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 07:51:31,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:51:33,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:51:34,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:51:35,156 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=8.33 vs. limit=8.0 2023-09-30 07:51:35,350 INFO [train.py:1039] (1/4) Epoch 19, batch 1150, loss[loss=0.1638, simple_loss=0.2345, pruned_loss=0.04652, over 18509.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2535, pruned_loss=0.05257, over 4666147.24 frames. ], batch size: 40, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:51:40,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:43,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:51:45,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:45,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:51:46,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 07:51:46,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:51:49,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 07:51:52,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:52,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:51:56,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 07:51:58,607 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:00,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=645193.3333333334, ans=0.0 2023-09-30 07:52:03,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:52:03,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=645193.3333333334, ans=0.2 2023-09-30 07:52:05,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:06,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 07:52:06,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:52:06,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:52:11,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 07:52:13,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:14,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:52:24,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:30,701 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=645326.6666666666, ans=0.0 2023-09-30 07:52:33,402 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:33,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 07:52:34,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:34,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:41,961 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 07:52:43,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:50,490 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 07:52:55,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:52:58,258 INFO [train.py:1039] (1/4) Epoch 19, batch 1200, loss[loss=0.187, simple_loss=0.2586, pruned_loss=0.05768, over 23652.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2547, pruned_loss=0.05262, over 4690174.80 frames. ], batch size: 232, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:52:58,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:52:58,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:52:58,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:53:01,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:05,982 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.28 vs. limit=12.0 2023-09-30 07:53:08,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:53:08,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:53:09,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:09,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:09,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:53:11,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:53:13,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:53:14,391 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.937e+02 2.117e+02 2.458e+02 3.944e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-30 07:53:14,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:14,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:18,239 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 07:53:21,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 07:53:23,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:53:26,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:53:29,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:29,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:53:29,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=645593.3333333334, ans=0.0 2023-09-30 07:53:30,833 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 07:53:31,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:41,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:53:41,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:53:41,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 07:53:41,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:53:44,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 07:53:51,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 07:53:51,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:52,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:54,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:53:56,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:53:56,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=645660.0, ans=0.2 2023-09-30 07:53:57,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:57,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:53:58,136 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=645660.0, ans=0.0 2023-09-30 07:53:59,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:53:59,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 07:54:01,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:54:01,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:01,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:54:02,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:02,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:08,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:54:09,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:54:10,307 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.50 vs. limit=10.0 2023-09-30 07:54:13,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 07:54:16,438 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:54:17,620 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 07:54:19,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:20,535 INFO [train.py:1039] (1/4) Epoch 19, batch 1250, loss[loss=0.1653, simple_loss=0.2539, pruned_loss=0.03831, over 24656.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2558, pruned_loss=0.05291, over 4695979.13 frames. ], batch size: 73, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:54:22,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:24,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:54:24,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:27,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 07:54:31,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:54:32,361 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=13.95 vs. limit=15.0 2023-09-30 07:54:32,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:32,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 07:54:35,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:54:37,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:54:39,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:54:41,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:43,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:54:43,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:44,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:54:49,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:54:49,838 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:54:49,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:51,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:52,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:54:56,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:56,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:55:00,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=645926.6666666666, ans=0.125 2023-09-30 07:55:03,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 07:55:03,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:55:06,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:06,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 07:55:08,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:55:08,277 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 07:55:08,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:08,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:08,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=645926.6666666666, ans=0.125 2023-09-30 07:55:12,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=645993.3333333334, ans=0.95 2023-09-30 07:55:13,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:17,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:18,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=645993.3333333334, ans=0.125 2023-09-30 07:55:19,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:55:19,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 07:55:19,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 07:55:21,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 07:55:24,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:26,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 07:55:26,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:31,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 07:55:31,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:55:32,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=15.0 2023-09-30 07:55:32,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 07:55:32,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:55:32,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:55:32,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:55:34,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:35,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 07:55:39,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:39,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:55:40,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:55:44,139 INFO [train.py:1039] (1/4) Epoch 19, batch 1300, loss[loss=0.1906, simple_loss=0.25, pruned_loss=0.06566, over 22746.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2558, pruned_loss=0.05304, over 4690884.85 frames. ], batch size: 322, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:55:44,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:55:47,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:47,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=646126.6666666666, ans=0.0 2023-09-30 07:55:48,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 07:55:51,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:54,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:55:54,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:55:56,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:59,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:56:00,445 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.929e+02 2.084e+02 2.401e+02 3.525e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-30 07:56:00,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 07:56:05,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:56:07,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:56:08,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 07:56:12,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:56:15,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:16,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:17,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:56:19,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:21,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:56:21,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:56:22,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 07:56:24,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=646260.0, ans=0.1 2023-09-30 07:56:30,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:56:30,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:56:32,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 07:56:32,494 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:56:34,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:56:35,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:56:35,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 07:56:37,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:37,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 07:56:38,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:44,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:44,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:56:49,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 07:56:49,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 07:56:51,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 07:56:56,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:56:57,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 07:56:59,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:05,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=646393.3333333334, ans=0.1 2023-09-30 07:57:06,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 07:57:07,836 INFO [train.py:1039] (1/4) Epoch 19, batch 1350, loss[loss=0.1698, simple_loss=0.2534, pruned_loss=0.04312, over 24690.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2552, pruned_loss=0.05299, over 4699018.59 frames. ], batch size: 65, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:57:10,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:13,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:14,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:16,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:16,579 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=646460.0, ans=0.125 2023-09-30 07:57:18,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=646460.0, ans=0.0 2023-09-30 07:57:19,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:57:19,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:26,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:27,066 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.64 vs. limit=22.5 2023-09-30 07:57:28,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 07:57:29,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:57:29,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:57:31,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 07:57:31,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=646526.6666666666, ans=0.0 2023-09-30 07:57:31,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=646526.6666666666, ans=0.0 2023-09-30 07:57:32,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:57:33,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:57:33,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 07:57:34,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 07:57:36,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 07:57:38,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:38,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 07:57:54,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:04,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:05,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:05,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 07:58:05,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=646660.0, ans=0.125 2023-09-30 07:58:09,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:09,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 07:58:09,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:58:11,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:58:14,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:58:16,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 07:58:17,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:58:24,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 07:58:26,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 07:58:30,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.73 vs. limit=15.0 2023-09-30 07:58:31,427 INFO [train.py:1039] (1/4) Epoch 19, batch 1400, loss[loss=0.1863, simple_loss=0.2567, pruned_loss=0.05798, over 23368.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2537, pruned_loss=0.05234, over 4703426.70 frames. ], batch size: 119, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:58:33,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 07:58:34,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:36,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:58:38,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:58:44,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 07:58:45,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 07:58:49,378 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.898e+02 2.160e+02 2.671e+02 3.929e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-30 07:58:55,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:58:57,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:00,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:59:00,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:59:04,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:59:06,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:59:16,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:16,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:23,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 07:59:24,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:59:24,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:59:26,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:59:26,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:27,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:59:27,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:59:29,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:59:30,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 07:59:32,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:59:34,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=646993.3333333334, ans=0.1 2023-09-30 07:59:35,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:39,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:59:47,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 07:59:49,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:59:49,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:59:51,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:59:52,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:59:52,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:59:54,267 INFO [train.py:1039] (1/4) Epoch 19, batch 1450, loss[loss=0.1353, simple_loss=0.1871, pruned_loss=0.04173, over 19148.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2524, pruned_loss=0.05205, over 4701401.71 frames. ], batch size: 388, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:59:55,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:59:59,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:59:59,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:59,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 07:59:59,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=647126.6666666666, ans=0.1 2023-09-30 08:00:04,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:04,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:00:06,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:00:07,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 08:00:07,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:00:09,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 08:00:09,826 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=647193.3333333334, ans=0.0 2023-09-30 08:00:10,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:11,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:11,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 08:00:14,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:14,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:00:14,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 08:00:14,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:15,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:00:16,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=647193.3333333334, ans=0.025 2023-09-30 08:00:18,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:21,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:24,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:00:24,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:00:27,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:27,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:29,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=647260.0, ans=0.125 2023-09-30 08:00:30,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:30,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:00:30,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:31,443 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.51 vs. limit=15.0 2023-09-30 08:00:32,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:32,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=647260.0, ans=0.0 2023-09-30 08:00:34,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=647260.0, ans=0.1 2023-09-30 08:00:35,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 08:00:37,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:40,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 08:00:42,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:00:43,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:00:45,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:45,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 08:00:45,743 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=647326.6666666666, ans=0.0 2023-09-30 08:00:49,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:50,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 08:00:52,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 08:00:55,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:58,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:00:58,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:00:58,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=647393.3333333334, ans=0.2 2023-09-30 08:01:00,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 08:01:03,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 08:01:03,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 08:01:03,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=647393.3333333334, ans=0.1 2023-09-30 08:01:05,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:06,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:01:09,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.57 vs. limit=5.0 2023-09-30 08:01:16,672 INFO [train.py:1039] (1/4) Epoch 19, batch 1500, loss[loss=0.2177, simple_loss=0.2772, pruned_loss=0.07914, over 22777.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2529, pruned_loss=0.05225, over 4709183.81 frames. ], batch size: 322, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 08:01:16,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 08:01:16,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:01:16,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:01:17,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=647460.0, ans=0.125 2023-09-30 08:01:18,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:18,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:19,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:01:22,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 08:01:25,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:01:25,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:01:25,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:27,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:01:28,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:01:31,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:34,838 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.890e+02 2.087e+02 2.413e+02 4.629e+02, threshold=4.174e+02, percent-clipped=1.0 2023-09-30 08:01:36,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:36,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 08:01:37,356 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.02 vs. limit=22.5 2023-09-30 08:01:38,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:01:38,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:01:40,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:41,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 08:01:47,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 08:01:50,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:50,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 08:01:51,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:01:53,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:01:54,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:54,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:56,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 08:01:58,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:01:58,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:01:59,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 08:02:00,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:02:00,801 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.63 vs. limit=6.0 2023-09-30 08:02:07,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:02:07,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 08:02:12,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:02:12,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:02:17,761 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 08:02:19,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:19,168 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 08:02:19,444 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:02:20,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:22,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:02:23,532 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 08:02:25,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:02:26,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 08:02:27,187 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=647726.6666666666, ans=0.95 2023-09-30 08:02:28,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:31,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:31,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:33,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:33,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:35,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:02:37,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 08:02:38,621 INFO [train.py:1039] (1/4) Epoch 19, batch 1550, loss[loss=0.1901, simple_loss=0.2731, pruned_loss=0.05358, over 24300.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2539, pruned_loss=0.05243, over 4720274.96 frames. ], batch size: 74, lr: 5.45e-03, grad_scale: 8.0 2023-09-30 08:02:38,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 08:02:38,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:02:39,423 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.86 vs. limit=15.0 2023-09-30 08:02:40,267 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 08:02:40,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 08:02:43,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:44,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:45,783 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.28 vs. limit=15.0 2023-09-30 08:02:46,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:02:46,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:02:46,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=647793.3333333334, ans=0.125 2023-09-30 08:02:47,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:47,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:51,450 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 08:02:51,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:51,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:02:53,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:02:55,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:02:55,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 08:02:56,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:58,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 08:02:58,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 08:02:58,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 08:02:59,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:01,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:05,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:03:08,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 08:03:08,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 08:03:16,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:22,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:03:22,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:03:22,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:03:22,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 08:03:30,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:03:33,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:34,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:03:36,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:03:37,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:37,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 08:03:37,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:03:40,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:03:40,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:43,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:03:43,170 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 08:03:46,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:03:51,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 08:03:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:03:59,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:59,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 08:04:01,500 INFO [train.py:1039] (1/4) Epoch 19, batch 1600, loss[loss=0.2046, simple_loss=0.2729, pruned_loss=0.06821, over 23399.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2544, pruned_loss=0.05274, over 4718275.93 frames. ], batch size: 285, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:04:03,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:04:03,378 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:04:04,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:04,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:04:04,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:04:04,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:04:08,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:09,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 08:04:10,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=648126.6666666666, ans=0.125 2023-09-30 08:04:11,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 08:04:13,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 08:04:16,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:17,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 08:04:19,711 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.864e+02 2.063e+02 2.300e+02 3.333e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-30 08:04:19,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:04:21,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:04:26,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:04:29,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=648193.3333333334, ans=0.0 2023-09-30 08:04:30,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 08:04:33,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:04:34,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 08:04:34,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:36,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 08:04:39,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 08:04:39,786 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=648260.0, ans=0.125 2023-09-30 08:04:47,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:49,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 08:04:49,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:50,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.whiten.whitening_limit, batch_count=648326.6666666666, ans=12.0 2023-09-30 08:04:51,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:51,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:04:52,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:04:56,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:04:56,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:57,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:59,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:00,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:05:01,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:05:04,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:05:04,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:05:09,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.27 vs. limit=15.0 2023-09-30 08:05:10,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=648393.3333333334, ans=0.125 2023-09-30 08:05:11,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:13,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:05:16,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 08:05:16,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:05:16,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 08:05:23,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:23,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:05:25,261 INFO [train.py:1039] (1/4) Epoch 19, batch 1650, loss[loss=0.1582, simple_loss=0.2374, pruned_loss=0.03948, over 24296.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2558, pruned_loss=0.05365, over 4689848.46 frames. ], batch size: 61, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:05:25,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:05:25,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 08:05:25,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 08:05:25,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 08:05:26,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 08:05:30,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:31,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:31,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:05:31,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:05:33,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:36,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 08:05:39,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:05:39,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:39,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:05:39,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:05:42,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 08:05:42,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 08:05:48,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:05:50,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:05:55,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=648526.6666666666, ans=0.125 2023-09-30 08:05:58,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 08:06:01,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:03,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 08:06:07,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:08,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=648593.3333333334, ans=0.125 2023-09-30 08:06:08,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=648593.3333333334, ans=0.0 2023-09-30 08:06:08,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=648593.3333333334, ans=0.125 2023-09-30 08:06:09,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:06:11,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:06:11,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:12,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:06:12,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:16,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:16,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:18,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:18,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:20,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:20,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:06:21,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:23,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 08:06:24,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:25,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 08:06:25,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 08:06:26,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 08:06:26,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:28,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:06:29,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:29,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:29,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 08:06:29,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=648726.6666666666, ans=0.0 2023-09-30 08:06:34,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:37,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:06:37,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:40,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 08:06:47,397 INFO [train.py:1039] (1/4) Epoch 19, batch 1700, loss[loss=0.1824, simple_loss=0.2482, pruned_loss=0.05833, over 23841.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2552, pruned_loss=0.05242, over 4703114.50 frames. ], batch size: 195, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:06:47,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:47,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:06:47,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 08:06:49,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:06:49,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:06:49,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:52,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:06:52,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:06:52,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 08:06:56,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:06:57,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=648793.3333333334, ans=0.1 2023-09-30 08:07:04,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:07:05,479 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.834e+02 2.122e+02 2.379e+02 4.054e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 08:07:05,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:07:11,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:07:12,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:12,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:07:14,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:14,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=648860.0, ans=0.1 2023-09-30 08:07:17,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 08:07:19,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:07:19,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:22,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:07:24,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:07:26,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 08:07:26,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 08:07:27,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:29,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 08:07:29,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:07:40,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:40,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:42,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:44,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:07:44,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 08:07:44,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:47,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:47,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 08:07:48,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:07:48,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:48,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:48,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:07:51,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:51,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:07:54,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:54,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:07:54,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:59,516 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.27 vs. limit=22.5 2023-09-30 08:08:00,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:00,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 08:08:02,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:03,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:06,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 08:08:07,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=649060.0, ans=0.0 2023-09-30 08:08:09,786 INFO [train.py:1039] (1/4) Epoch 19, batch 1750, loss[loss=0.1695, simple_loss=0.2391, pruned_loss=0.04995, over 23667.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2535, pruned_loss=0.05186, over 4707528.30 frames. ], batch size: 232, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:08:11,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:14,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:15,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:08:15,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 08:08:15,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=649126.6666666666, ans=0.125 2023-09-30 08:08:16,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:08:16,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=649126.6666666666, ans=0.0 2023-09-30 08:08:19,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:08:19,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:19,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=649126.6666666666, ans=0.0 2023-09-30 08:08:24,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 08:08:24,672 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.98 vs. limit=15.0 2023-09-30 08:08:26,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:29,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 08:08:29,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:08:31,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:08:34,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:08:36,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 08:08:39,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:08:39,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 08:08:50,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:08:52,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:08:52,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:55,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:55,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:58,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:00,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:02,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:02,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:09:04,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 08:09:06,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:06,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=649326.6666666666, ans=0.125 2023-09-30 08:09:07,146 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.24 vs. limit=22.5 2023-09-30 08:09:09,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 08:09:11,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:11,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:12,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:09:17,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:09:18,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 08:09:18,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:22,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:23,045 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=649393.3333333334, ans=0.0 2023-09-30 08:09:25,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:27,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:29,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:09:30,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 08:09:30,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:30,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=649460.0, ans=0.125 2023-09-30 08:09:31,820 INFO [train.py:1039] (1/4) Epoch 19, batch 1800, loss[loss=0.1652, simple_loss=0.2139, pruned_loss=0.05819, over 19173.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2525, pruned_loss=0.0515, over 4710199.99 frames. ], batch size: 388, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:09:32,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:09:32,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:32,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:09:32,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:09:32,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:09:37,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:09:38,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:40,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:09:43,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:47,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:09:47,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:50,069 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.833e+02 2.073e+02 2.384e+02 3.418e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:09:50,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:09:53,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:53,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:54,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:09:58,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:58,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 08:09:58,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:02,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=649526.6666666666, ans=0.125 2023-09-30 08:10:03,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:06,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 08:10:09,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 08:10:09,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 08:10:10,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:11,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:10:11,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:11,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:10:20,271 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 08:10:21,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:10:23,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:24,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 08:10:25,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 08:10:25,246 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:10:26,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:10:26,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:10:28,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:10:33,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 08:10:39,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:10:41,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 08:10:41,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:10:42,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:42,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:10:44,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 08:10:47,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:10:47,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:10:51,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 08:10:51,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:53,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:10:53,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=649793.3333333334, ans=0.0 2023-09-30 08:10:54,644 INFO [train.py:1039] (1/4) Epoch 19, batch 1850, loss[loss=0.1841, simple_loss=0.2681, pruned_loss=0.05008, over 24333.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2533, pruned_loss=0.0514, over 4713746.86 frames. ], batch size: 74, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:10:54,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:10:54,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:56,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:56,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:10:58,626 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=649793.3333333334, ans=0.0 2023-09-30 08:10:59,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:59,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:01,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:11:01,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:11,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:11:11,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 08:11:11,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-09-30 08:11:15,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 08:11:17,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 08:11:21,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=649860.0, ans=0.0 2023-09-30 08:11:22,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:22,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 08:11:22,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 08:11:32,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:11:34,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 08:11:38,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:11:38,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:11:44,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 08:11:45,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:45,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:11:47,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:11:49,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:50,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:54,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:11:54,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:55,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:11:56,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:57,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:11:58,242 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.92 vs. limit=15.0 2023-09-30 08:11:59,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:12:02,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 08:12:02,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:12:07,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:12:07,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:12:07,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 08:12:07,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 08:12:09,573 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 08:12:11,038 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 08:12:12,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:12:12,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:12:12,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:12,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:13,487 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 08:12:13,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:12:13,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:14,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:12:16,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:12:16,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:12:16,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 08:12:18,011 INFO [train.py:1039] (1/4) Epoch 19, batch 1900, loss[loss=0.1529, simple_loss=0.2364, pruned_loss=0.03472, over 24475.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2534, pruned_loss=0.05094, over 4725319.23 frames. ], batch size: 63, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:12:19,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:19,663 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 08:12:19,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:12:22,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:29,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:32,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:12:34,266 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 08:12:35,699 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.831e+02 2.039e+02 2.265e+02 4.223e+02, threshold=4.078e+02, percent-clipped=1.0 2023-09-30 08:12:35,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 08:12:36,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:37,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:12:37,556 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 08:12:37,612 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 08:12:42,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 08:12:44,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:12:48,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 08:12:52,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 08:12:56,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=650260.0, ans=0.1 2023-09-30 08:13:00,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 08:13:05,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 08:13:05,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:05,705 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 08:13:05,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 08:13:05,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 08:13:07,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 08:13:07,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:13:09,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=650326.6666666666, ans=0.125 2023-09-30 08:13:10,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 08:13:14,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:13:15,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:15,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 08:13:20,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:13:23,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 08:13:23,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:30,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:13:30,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:13:30,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:13:32,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:13:33,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:13:33,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:13:34,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:13:38,073 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:38,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:13:40,095 INFO [train.py:1039] (1/4) Epoch 19, batch 1950, loss[loss=0.1599, simple_loss=0.2397, pruned_loss=0.04006, over 24606.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2545, pruned_loss=0.05176, over 4723338.80 frames. ], batch size: 60, lr: 5.44e-03, grad_scale: 8.0 2023-09-30 08:13:41,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:13:41,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:41,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:43,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:48,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:13:49,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:13:49,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:49,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:13:53,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 08:13:54,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:13:55,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:55,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:58,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:13:59,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:13:59,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:02,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:03,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:14:03,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:14:03,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:14:03,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:04,488 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.22 vs. limit=22.5 2023-09-30 08:14:06,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:12,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:14:12,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:12,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:14:12,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 08:14:12,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:14:13,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:14:13,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:18,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:20,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:14:25,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:14:28,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:14:28,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:14:30,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 08:14:30,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:14:33,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:35,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:14:35,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:14:41,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:43,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:45,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:49,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:52,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:14:53,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:54,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 08:14:54,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:14:55,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:55,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=650726.6666666666, ans=0.0 2023-09-30 08:14:57,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 08:14:57,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:15:03,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:15:04,371 INFO [train.py:1039] (1/4) Epoch 19, batch 2000, loss[loss=0.1661, simple_loss=0.2375, pruned_loss=0.04732, over 23529.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2549, pruned_loss=0.05191, over 4720927.06 frames. ], batch size: 134, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:15:04,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:15:04,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:05,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:15:07,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:10,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 08:15:12,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:15:13,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=650793.3333333334, ans=0.0 2023-09-30 08:15:16,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:15:18,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 08:15:18,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:15:18,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:15:22,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:15:23,989 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.096e+02 2.439e+02 2.971e+02 4.515e+02, threshold=4.878e+02, percent-clipped=2.0 2023-09-30 08:15:24,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 08:15:25,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:27,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:28,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:30,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 08:15:30,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:15:32,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 08:15:32,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:36,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:15:39,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:15:39,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:39,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:42,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:15:42,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 08:15:45,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 08:15:45,351 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:45,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:15:51,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:15:51,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:52,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:56,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:57,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:58,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:58,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:58,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=650993.3333333334, ans=0.0 2023-09-30 08:16:00,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:04,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:16:04,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 08:16:09,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:16:10,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:13,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:13,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:16:18,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:21,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:16:21,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:16:23,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:25,875 INFO [train.py:1039] (1/4) Epoch 19, batch 2050, loss[loss=0.1954, simple_loss=0.2546, pruned_loss=0.06809, over 23813.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.255, pruned_loss=0.05209, over 4714076.17 frames. ], batch size: 212, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:16:25,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:30,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:30,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:34,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:16:38,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:16:40,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:40,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:16:43,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 08:16:43,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:16:44,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:44,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:16:53,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:16:53,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:56,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 08:16:59,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:17:01,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 08:17:01,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:17:06,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:06,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:08,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:17:08,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:10,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:17:12,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:17:12,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:17:15,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:16,398 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.44 vs. limit=15.0 2023-09-30 08:17:17,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:17:20,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:17:22,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:17:26,214 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.77 vs. limit=15.0 2023-09-30 08:17:26,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:30,571 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=651326.6666666666, ans=0.125 2023-09-30 08:17:31,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:17:33,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 08:17:40,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:40,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:17:40,410 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=651393.3333333334, ans=0.0 2023-09-30 08:17:43,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:17:44,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 08:17:50,125 INFO [train.py:1039] (1/4) Epoch 19, batch 2100, loss[loss=0.1911, simple_loss=0.2556, pruned_loss=0.06328, over 23877.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2533, pruned_loss=0.05183, over 4706835.66 frames. ], batch size: 212, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:17:50,342 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 08:17:50,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:17:50,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:51,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:17:53,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:53,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 08:17:53,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 08:17:55,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:57,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=651460.0, ans=0.125 2023-09-30 08:17:58,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:17:58,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:18:00,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:01,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:18:01,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 08:18:03,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:18:04,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 08:18:04,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 08:18:06,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:07,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:07,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 08:18:07,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 08:18:09,219 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.858e+02 2.144e+02 2.529e+02 4.189e+02, threshold=4.288e+02, percent-clipped=0.0 2023-09-30 08:18:13,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 08:18:13,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:18:15,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=651526.6666666666, ans=0.0 2023-09-30 08:18:16,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:18:16,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:18:19,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:18:21,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 08:18:21,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:21,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 08:18:24,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 08:18:26,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:26,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 08:18:26,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 08:18:28,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 08:18:29,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:18:33,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:18:34,799 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=651593.3333333334, ans=0.125 2023-09-30 08:18:36,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:36,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:37,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:39,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:39,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 08:18:39,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:40,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:40,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:40,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 08:18:42,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 08:18:43,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 08:18:47,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:18:52,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:52,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 08:18:52,911 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.15 vs. limit=15.0 2023-09-30 08:18:58,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:01,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:19:03,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:03,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:03,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:19:03,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:04,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:04,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:19:05,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:19:05,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:07,434 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.34 vs. limit=15.0 2023-09-30 08:19:08,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 08:19:09,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 08:19:09,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:12,615 INFO [train.py:1039] (1/4) Epoch 19, batch 2150, loss[loss=0.1843, simple_loss=0.2471, pruned_loss=0.06075, over 23530.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.252, pruned_loss=0.05143, over 4706117.51 frames. ], batch size: 134, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:19:12,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:19:12,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:19:12,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:19:12,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:19:19,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:19:22,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:22,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:24,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:19:24,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:25,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:19:27,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:27,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:19:27,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:19:32,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:32,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 08:19:39,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:41,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:19:41,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:42,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:19:44,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:44,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:44,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:45,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 08:19:45,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=651926.6666666666, ans=0.1 2023-09-30 08:19:47,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:19:48,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:48,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:50,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:51,179 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=651926.6666666666, ans=0.04949747468305833 2023-09-30 08:19:52,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:19:55,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:55,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:19:57,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 08:19:57,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:20:00,277 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=651993.3333333334, ans=0.015 2023-09-30 08:20:01,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:01,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:03,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:03,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:20:05,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:07,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:07,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 08:20:07,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=651993.3333333334, ans=0.04949747468305833 2023-09-30 08:20:09,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 08:20:10,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:20:10,772 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 08:20:10,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:12,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:20:12,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 08:20:12,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:20:12,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 08:20:13,810 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 08:20:13,810 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 08:20:13,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 08:20:16,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:16,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:20:16,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:20:18,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:20,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:20:21,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:21,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:28,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=652060.0, ans=0.125 2023-09-30 08:20:29,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:20:31,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 08:20:31,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=652060.0, ans=0.1 2023-09-30 08:20:34,266 INFO [train.py:1039] (1/4) Epoch 19, batch 2200, loss[loss=0.1968, simple_loss=0.2493, pruned_loss=0.07215, over 19248.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2521, pruned_loss=0.0512, over 4708038.92 frames. ], batch size: 388, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:20:35,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:20:37,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=652126.6666666666, ans=0.0 2023-09-30 08:20:41,238 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=652126.6666666666, ans=0.09899494936611666 2023-09-30 08:20:42,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:42,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:20:44,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:20:44,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:20:46,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:48,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:20:48,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 08:20:50,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=652193.3333333334, ans=0.125 2023-09-30 08:20:53,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 08:20:54,310 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.914e+02 2.227e+02 2.792e+02 4.256e+02, threshold=4.455e+02, percent-clipped=0.0 2023-09-30 08:20:55,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:21:01,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 08:21:02,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:03,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:04,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:21:09,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:21:09,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 08:21:12,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:21:14,435 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:14,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 08:21:14,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=652260.0, ans=0.125 2023-09-30 08:21:16,293 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=652260.0, ans=0.125 2023-09-30 08:21:19,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:21:19,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:22,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:21:22,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:26,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 08:21:27,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:28,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 08:21:32,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:32,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:21:32,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:33,799 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.65 vs. limit=15.0 2023-09-30 08:21:34,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:34,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:35,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:35,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:37,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:21:37,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:21:37,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=652326.6666666666, ans=0.125 2023-09-30 08:21:40,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:21:42,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:21:42,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:21:45,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:21:47,415 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 08:21:48,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:21:49,070 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 08:21:51,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:21:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 08:21:54,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:54,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:21:56,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:57,589 INFO [train.py:1039] (1/4) Epoch 19, batch 2250, loss[loss=0.1853, simple_loss=0.2588, pruned_loss=0.05587, over 23611.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2529, pruned_loss=0.05146, over 4716313.30 frames. ], batch size: 149, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:21:59,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 08:22:00,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:22:02,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:06,144 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:22:08,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:22:08,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=652460.0, ans=0.125 2023-09-30 08:22:09,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:22:12,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:12,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:14,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:15,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 08:22:16,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:16,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:22:19,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 08:22:20,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:22:21,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:22,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:28,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:30,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:22:31,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:22:31,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 08:22:33,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:34,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:22:39,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:41,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:42,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=652593.3333333334, ans=0.125 2023-09-30 08:22:43,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:22:43,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:47,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:50,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:22:55,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:22:58,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:23:05,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:23:06,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:23:06,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:23:07,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=652726.6666666666, ans=0.125 2023-09-30 08:23:11,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:23:16,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:23:16,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 08:23:16,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:16,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:23:19,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 08:23:21,050 INFO [train.py:1039] (1/4) Epoch 19, batch 2300, loss[loss=0.16, simple_loss=0.245, pruned_loss=0.03748, over 24661.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2531, pruned_loss=0.0514, over 4723805.61 frames. ], batch size: 68, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:23:22,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:23:22,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:28,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:29,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:23:32,509 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 08:23:34,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:40,225 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.880e+02 2.084e+02 2.491e+02 4.260e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 08:23:40,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=652860.0, ans=0.2 2023-09-30 08:23:43,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:23:43,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:23:43,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:23:43,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:43,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 08:23:44,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:23:46,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=652860.0, ans=0.09899494936611666 2023-09-30 08:23:47,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:23:47,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:23:51,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:23:55,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:23:58,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:03,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:24:03,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:24:07,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:24:07,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:24:10,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:24:11,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:24:11,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:24:11,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 08:24:18,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:24:18,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:18,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:18,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:24:18,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:20,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:24:20,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:24:20,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 08:24:21,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:24:21,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:21,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 08:24:29,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:24:32,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:24:36,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:36,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:24:38,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:24:38,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:24:38,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=653060.0, ans=0.0 2023-09-30 08:24:39,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:24:39,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:24:41,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 08:24:42,943 INFO [train.py:1039] (1/4) Epoch 19, batch 2350, loss[loss=0.1941, simple_loss=0.2714, pruned_loss=0.05843, over 23882.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2541, pruned_loss=0.05144, over 4718651.92 frames. ], batch size: 86, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:24:44,249 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.93 vs. limit=6.0 2023-09-30 08:24:46,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:24:46,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 08:24:55,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 08:24:57,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:25:00,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:00,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:01,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:01,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:03,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 08:25:07,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:25:13,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 08:25:14,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:16,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:25:16,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:25:19,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:25:21,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 08:25:22,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:25:23,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:23,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:23,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:25:28,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:25:30,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 08:25:30,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=653260.0, ans=0.125 2023-09-30 08:25:32,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:25:35,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:35,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:25:38,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 08:25:38,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:25:42,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 08:25:43,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:25:48,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 08:25:54,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 08:25:55,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:55,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 08:25:55,566 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 08:25:55,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 08:25:57,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 08:25:59,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:26:04,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:26:06,051 INFO [train.py:1039] (1/4) Epoch 19, batch 2400, loss[loss=0.1699, simple_loss=0.2495, pruned_loss=0.04513, over 23606.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2536, pruned_loss=0.05122, over 4723888.03 frames. ], batch size: 149, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:26:10,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:26:13,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:26:13,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 08:26:13,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 08:26:16,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-09-30 08:26:19,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:26:19,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:26:22,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 08:26:22,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:26:22,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:24,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 08:26:25,930 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.937e+02 2.110e+02 2.328e+02 3.835e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-30 08:26:29,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:32,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 08:26:37,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:26:43,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 08:26:44,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:26:46,561 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=653593.3333333334, ans=0.125 2023-09-30 08:26:47,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:48,105 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=653593.3333333334, ans=0.125 2023-09-30 08:26:53,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:26:54,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 08:26:54,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:26:56,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.15 vs. limit=22.5 2023-09-30 08:27:02,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:04,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:07,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:08,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:27:08,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:27:08,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:27:08,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:11,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:11,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:27:17,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:27:18,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:27:18,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 08:27:20,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 08:27:22,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:27:23,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:23,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 08:27:23,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 08:27:23,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 08:27:23,233 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 08:27:25,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 08:27:26,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:27:29,755 INFO [train.py:1039] (1/4) Epoch 19, batch 2450, loss[loss=0.1768, simple_loss=0.2682, pruned_loss=0.0427, over 24303.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2525, pruned_loss=0.05086, over 4733404.43 frames. ], batch size: 74, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:27:29,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:29,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:31,395 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 08:27:31,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:31,722 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=653793.3333333334, ans=0.125 2023-09-30 08:27:32,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:27:36,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:27:36,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:38,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=653793.3333333334, ans=0.0 2023-09-30 08:27:39,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:39,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:40,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 08:27:45,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:45,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:51,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:27:51,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:27:51,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:27:52,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 08:27:56,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=653860.0, ans=0.125 2023-09-30 08:27:57,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:58,044 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=653860.0, ans=0.125 2023-09-30 08:27:59,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:27:59,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:28:04,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:28:04,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:05,281 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.08 vs. limit=6.0 2023-09-30 08:28:06,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:07,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:28:09,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 08:28:10,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:28:18,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:20,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:28:20,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:20,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:28:20,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:24,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:28:24,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 08:28:26,605 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.82 vs. limit=12.0 2023-09-30 08:28:28,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:28,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:28:31,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:28:31,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:37,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:28:37,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 08:28:39,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:28:39,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:28:39,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 08:28:41,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:28:41,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:28:43,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=654060.0, ans=0.0 2023-09-30 08:28:46,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.33 vs. limit=15.0 2023-09-30 08:28:47,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:28:50,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:50,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:28:51,564 INFO [train.py:1039] (1/4) Epoch 19, batch 2500, loss[loss=0.1654, simple_loss=0.2461, pruned_loss=0.04237, over 24631.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2516, pruned_loss=0.05088, over 4729572.84 frames. ], batch size: 65, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:28:53,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 08:28:54,199 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=654126.6666666666, ans=0.125 2023-09-30 08:28:55,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:29:01,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:11,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:29:11,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:29:12,084 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.16 vs. limit=10.0 2023-09-30 08:29:12,730 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.785e+02 1.970e+02 2.171e+02 3.825e+02, threshold=3.939e+02, percent-clipped=0.0 2023-09-30 08:29:12,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:12,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 08:29:13,314 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:29:20,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:29:22,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:29:22,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:29:22,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:29:22,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 08:29:25,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:25,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:25,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 08:29:25,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:27,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 08:29:27,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:32,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:29:32,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:35,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=654260.0, ans=0.125 2023-09-30 08:29:36,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:29:36,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 08:29:36,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:29:38,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:41,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:47,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:52,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:29:54,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=654326.6666666666, ans=0.125 2023-09-30 08:29:56,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:29:58,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=654393.3333333334, ans=0.1 2023-09-30 08:29:59,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 08:30:00,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:00,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:03,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:30:03,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:30:03,958 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 08:30:03,959 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 08:30:03,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 08:30:09,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:10,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 08:30:11,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 08:30:11,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:30:12,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 08:30:14,927 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-09-30 08:30:15,597 INFO [train.py:1039] (1/4) Epoch 19, batch 2550, loss[loss=0.1975, simple_loss=0.2746, pruned_loss=0.06022, over 23265.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2524, pruned_loss=0.05102, over 4734469.66 frames. ], batch size: 93, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:30:15,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 08:30:17,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:20,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:30:20,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:30:24,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:25,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 08:30:25,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:30:30,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 08:30:31,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:30:33,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:36,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 08:30:37,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:30:37,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:37,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:39,355 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:30:41,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 08:30:41,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:41,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:41,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 08:30:51,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:30:51,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=654593.3333333334, ans=0.1 2023-09-30 08:30:58,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:30:58,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:58,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:59,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:31:05,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:31:08,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:31:08,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:31:10,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:31:10,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:31:10,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:31:12,936 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=654660.0, ans=0.0 2023-09-30 08:31:15,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:23,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:31:23,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 08:31:23,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:31:23,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:23,639 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=654726.6666666666, ans=0.125 2023-09-30 08:31:25,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:31:25,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:31:26,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:34,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:31:34,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=654726.6666666666, ans=0.2 2023-09-30 08:31:35,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:37,207 INFO [train.py:1039] (1/4) Epoch 19, batch 2600, loss[loss=0.1559, simple_loss=0.246, pruned_loss=0.03291, over 24475.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2529, pruned_loss=0.05045, over 4748765.37 frames. ], batch size: 66, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:31:40,314 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 08:31:43,942 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 08:31:43,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:31:44,029 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 08:31:44,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 08:31:45,522 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 08:31:48,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:48,612 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 08:31:50,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 08:31:50,831 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 08:31:54,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:31:54,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 08:31:56,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 08:31:57,363 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.901e+02 2.112e+02 2.544e+02 3.828e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-30 08:31:57,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:31:59,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 08:31:59,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=654860.0, ans=0.125 2023-09-30 08:32:02,613 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 08:32:03,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 08:32:10,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:11,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:11,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:11,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 08:32:14,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:32:19,922 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 08:32:26,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:26,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:28,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 08:32:28,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:28,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:29,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 08:32:31,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:32:33,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:32:34,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:36,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=654993.3333333334, ans=0.125 2023-09-30 08:32:37,968 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 08:32:39,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:39,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:32:44,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:45,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:32:45,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 08:32:45,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:47,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:32:48,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:32:49,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=655060.0, ans=0.07 2023-09-30 08:32:55,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 08:32:56,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:58,319 INFO [train.py:1039] (1/4) Epoch 19, batch 2650, loss[loss=0.1913, simple_loss=0.2527, pruned_loss=0.06494, over 23763.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2535, pruned_loss=0.05113, over 4752322.60 frames. ], batch size: 164, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:32:58,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:33:02,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 08:33:02,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:04,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:33:06,506 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 08:33:06,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:08,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:12,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:33:14,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:33:15,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:33:17,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 08:33:17,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:33:17,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:33:20,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 08:33:23,362 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 08:33:23,994 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.66 vs. limit=15.0 2023-09-30 08:33:24,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:27,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 08:33:27,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:28,795 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 08:33:30,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:30,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:33:30,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:32,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:36,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 08:33:36,436 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=655260.0, ans=0.125 2023-09-30 08:33:37,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 08:33:39,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:33:42,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 08:33:42,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:43,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=655260.0, ans=0.0 2023-09-30 08:33:44,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:44,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:33:44,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=655260.0, ans=0.1 2023-09-30 08:33:45,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:47,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:47,634 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=655326.6666666666, ans=0.125 2023-09-30 08:33:50,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:51,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:33:53,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:53,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:33:54,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:33:57,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:58,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:34:00,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:01,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:01,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=655326.6666666666, ans=0.125 2023-09-30 08:34:02,204 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.55 vs. limit=6.0 2023-09-30 08:34:02,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:34:04,888 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=655393.3333333334, ans=0.2 2023-09-30 08:34:06,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:06,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:34:06,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:08,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 08:34:12,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:34:13,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:14,092 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=655393.3333333334, ans=0.07 2023-09-30 08:34:15,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:17,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:34:17,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:20,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:20,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 08:34:21,741 INFO [train.py:1039] (1/4) Epoch 19, batch 2700, loss[loss=0.1724, simple_loss=0.2437, pruned_loss=0.0506, over 23518.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.255, pruned_loss=0.052, over 4736380.83 frames. ], batch size: 134, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:34:21,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:34:23,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:34:25,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:34:25,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:26,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:28,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:34:28,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:28,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:34:28,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:34:28,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=655460.0, ans=0.0 2023-09-30 08:34:29,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 08:34:30,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:34:32,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:34:33,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:34:35,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:39,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:34:41,046 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.869e+02 2.051e+02 2.484e+02 4.492e+02, threshold=4.101e+02, percent-clipped=1.0 2023-09-30 08:34:41,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 08:34:41,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:34:46,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:34:46,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:34:46,815 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=655526.6666666666, ans=0.0 2023-09-30 08:34:53,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:34:53,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:53,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:34:53,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:34:57,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:59,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:00,255 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.70 vs. limit=12.0 2023-09-30 08:35:00,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:35:00,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:05,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:05,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:35:14,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:35:16,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:35:16,725 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.14 vs. limit=10.0 2023-09-30 08:35:21,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:35:21,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:23,056 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=655660.0, ans=0.5 2023-09-30 08:35:26,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:26,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:27,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:28,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:29,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:29,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:35:34,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:35:34,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:34,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:37,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 08:35:40,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:40,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:35:40,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 08:35:41,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=655726.6666666666, ans=0.0 2023-09-30 08:35:42,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 08:35:42,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:43,942 INFO [train.py:1039] (1/4) Epoch 19, batch 2750, loss[loss=0.1844, simple_loss=0.2545, pruned_loss=0.05715, over 19427.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2545, pruned_loss=0.05214, over 4730433.72 frames. ], batch size: 42, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:35:45,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:35:47,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:49,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:49,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:35:51,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:53,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:35:54,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:35:54,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:35:54,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:54,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 08:35:54,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:54,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:57,770 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.22 vs. limit=12.0 2023-09-30 08:35:59,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 08:36:01,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:36:02,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:03,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:04,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:36:04,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:36:04,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=655860.0, ans=0.0 2023-09-30 08:36:06,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:36:07,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:08,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:12,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:36:12,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:36:13,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:36:15,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:15,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=655926.6666666666, ans=0.125 2023-09-30 08:36:16,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:36:24,267 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=655926.6666666666, ans=0.2 2023-09-30 08:36:25,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:27,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:36:27,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:35,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:35,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:36:35,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:36:43,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:36:44,019 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.73 vs. limit=6.0 2023-09-30 08:36:44,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:44,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 08:36:44,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=655993.3333333334, ans=0.125 2023-09-30 08:36:47,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:49,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 08:36:53,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:36:57,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:36:59,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 08:36:59,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:01,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:37:01,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 08:37:03,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:37:07,165 INFO [train.py:1039] (1/4) Epoch 19, batch 2800, loss[loss=0.1738, simple_loss=0.2681, pruned_loss=0.03978, over 24443.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2535, pruned_loss=0.05219, over 4717681.27 frames. ], batch size: 69, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:37:07,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:37:07,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:08,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:08,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 08:37:08,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:10,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:10,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=656126.6666666666, ans=0.2 2023-09-30 08:37:11,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:11,962 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 08:37:11,963 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 08:37:15,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:16,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:37:16,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:37:20,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:37:21,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 08:37:23,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:37:23,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 08:37:23,646 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:37:24,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:26,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:37:26,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:28,348 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.835e+02 2.025e+02 2.355e+02 3.473e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 08:37:32,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:32,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:32,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:37:32,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:37:34,715 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:37:41,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:37:42,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:45,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:47,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:47,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:52,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:37:52,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 08:37:52,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:52,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:52,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:37:57,740 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.90 vs. limit=12.0 2023-09-30 08:37:58,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:58,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:02,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:38:04,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:38:04,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:04,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:38:06,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:38:06,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:38:08,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:38:08,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 08:38:08,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:08,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=656326.6666666666, ans=0.125 2023-09-30 08:38:10,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:38:10,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:11,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 08:38:11,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:11,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:38:13,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:38:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 08:38:19,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:38:19,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:38:21,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:38:24,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:27,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:38:27,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:38:28,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:38:30,212 INFO [train.py:1039] (1/4) Epoch 19, batch 2850, loss[loss=0.173, simple_loss=0.255, pruned_loss=0.04547, over 24505.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2528, pruned_loss=0.05213, over 4710466.67 frames. ], batch size: 63, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:38:31,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:31,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:36,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:38:36,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 08:38:44,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 08:38:44,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:45,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 08:38:45,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:48,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 08:38:50,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 08:38:51,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:04,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:04,274 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=656593.3333333334, ans=0.125 2023-09-30 08:39:05,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:05,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:39:07,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:39:07,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:39:07,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:39:09,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:39:10,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 08:39:11,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=656593.3333333334, ans=0.125 2023-09-30 08:39:14,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:39:14,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:14,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:16,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:18,135 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:39:19,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:19,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:20,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:22,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:22,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:39:24,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:24,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:26,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:39:27,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=656660.0, ans=0.1 2023-09-30 08:39:29,646 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.60 vs. limit=22.5 2023-09-30 08:39:31,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:39:33,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 08:39:33,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 08:39:35,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:39:35,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:35,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 08:39:36,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:39:38,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:38,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:39,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:39:39,477 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 08:39:40,253 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 08:39:40,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:39:40,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:42,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=656726.6666666666, ans=0.1 2023-09-30 08:39:46,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:39:46,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:48,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:50,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 08:39:51,775 INFO [train.py:1039] (1/4) Epoch 19, batch 2900, loss[loss=0.1886, simple_loss=0.2606, pruned_loss=0.05828, over 23201.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2526, pruned_loss=0.0517, over 4719076.91 frames. ], batch size: 105, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:39:53,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:54,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 08:39:55,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 08:39:55,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=656793.3333333334, ans=0.025 2023-09-30 08:39:55,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=656793.3333333334, ans=0.125 2023-09-30 08:39:56,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:39:56,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:39:56,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=656793.3333333334, ans=0.125 2023-09-30 08:39:59,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:01,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:40:04,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:40:05,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:40:08,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:40:08,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 08:40:10,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:40:11,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:12,061 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=656860.0, ans=0.0 2023-09-30 08:40:13,065 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.848e+02 2.073e+02 2.444e+02 4.000e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:40:14,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 08:40:16,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 08:40:19,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:40:19,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 08:40:19,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:40:20,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=656860.0, ans=0.0 2023-09-30 08:40:22,590 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.65 vs. limit=15.0 2023-09-30 08:40:23,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:40:23,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:40:27,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:29,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:32,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:40:35,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:40:37,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 08:40:37,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 08:40:37,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:40:41,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:40:41,965 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=656993.3333333334, ans=0.0 2023-09-30 08:40:44,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 08:40:46,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:40:51,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:54,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=657060.0, ans=0.0 2023-09-30 08:40:57,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=15.0 2023-09-30 08:40:59,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:41:01,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:41:01,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 08:41:05,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:05,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 08:41:05,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:05,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:41:12,577 INFO [train.py:1039] (1/4) Epoch 19, batch 2950, loss[loss=0.18, simple_loss=0.2603, pruned_loss=0.04986, over 24597.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2535, pruned_loss=0.05158, over 4732996.11 frames. ], batch size: 68, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:41:12,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:15,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 08:41:17,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:17,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:18,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:41:20,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:41:21,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 08:41:23,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 08:41:23,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:41:23,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:30,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:33,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:41:35,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:41:35,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:38,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=657193.3333333334, ans=0.05 2023-09-30 08:41:39,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:41:41,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:41:42,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:41:48,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 08:41:53,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 08:41:53,498 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 08:41:54,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:41:56,551 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 08:41:58,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 08:41:58,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:58,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:58,301 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 08:41:58,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:42:01,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 08:42:03,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:42:03,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:42:03,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=657326.6666666666, ans=10.0 2023-09-30 08:42:05,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:05,677 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.38 vs. limit=15.0 2023-09-30 08:42:07,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:42:07,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:07,191 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 08:42:07,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:07,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 08:42:12,517 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:42:15,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:17,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:18,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 08:42:18,961 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:42:20,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 08:42:22,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:23,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:42:25,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:42:26,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:26,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:42:28,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:42:29,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:29,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:42:29,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:42:31,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:31,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:42:31,689 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:42:32,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:34,283 INFO [train.py:1039] (1/4) Epoch 19, batch 3000, loss[loss=0.1604, simple_loss=0.2281, pruned_loss=0.04631, over 23498.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2544, pruned_loss=0.05188, over 4728190.78 frames. ], batch size: 134, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:42:34,283 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 08:42:48,939 INFO [train.py:1071] (1/4) Epoch 19, validation: loss=0.3515, simple_loss=0.275, pruned_loss=0.214, over 1125622.00 frames. 2023-09-30 08:42:48,939 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 08:42:49,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 08:42:49,391 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=657460.0, ans=0.1 2023-09-30 08:42:50,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:52,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:42:52,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:42:54,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=657460.0, ans=0.125 2023-09-30 08:42:55,309 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 08:42:55,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 08:42:57,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:57,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:42:57,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=657460.0, ans=0.125 2023-09-30 08:42:58,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 08:42:59,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:43:07,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:43:11,618 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.829e+02 2.117e+02 2.474e+02 3.888e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 08:43:16,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:43:25,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 08:43:25,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:43:27,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:43:27,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:43:27,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=657593.3333333334, ans=0.0 2023-09-30 08:43:28,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:43:30,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:30,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 08:43:33,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 08:43:35,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:43:35,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:43:37,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:43:37,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:38,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:38,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:43:39,466 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.60 vs. limit=22.5 2023-09-30 08:43:43,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:43:43,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:43,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:43:46,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:49,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 08:43:50,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:43:50,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:43:50,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:43:55,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:55,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:55,994 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=657726.6666666666, ans=0.125 2023-09-30 08:43:59,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:43:59,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 08:43:59,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:43:59,337 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 08:44:00,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:44:02,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 08:44:02,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=657726.6666666666, ans=0.125 2023-09-30 08:44:04,017 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:04,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:44:05,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 08:44:05,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 08:44:05,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:44:07,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:44:07,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:44:07,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:44:07,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:09,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:44:12,053 INFO [train.py:1039] (1/4) Epoch 19, batch 3050, loss[loss=0.1731, simple_loss=0.2465, pruned_loss=0.04987, over 23431.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2535, pruned_loss=0.05106, over 4728480.99 frames. ], batch size: 105, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:44:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 08:44:15,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:16,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:16,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:44:20,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:22,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 08:44:32,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 08:44:32,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 08:44:32,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:36,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:44:41,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:42,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:43,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:46,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:44:46,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:46,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:48,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:48,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:48,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:50,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:53,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:53,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 08:44:53,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:53,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:44:58,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:58,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:44:58,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:00,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:03,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=657993.3333333334, ans=0.125 2023-09-30 08:45:06,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:45:06,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:14,258 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=3.99 vs. limit=5.0 2023-09-30 08:45:16,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:16,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:45:16,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:45:19,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:19,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:45:19,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:45:21,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 08:45:22,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:22,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:24,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 08:45:27,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:31,793 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=658060.0, ans=0.125 2023-09-30 08:45:33,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:34,280 INFO [train.py:1039] (1/4) Epoch 19, batch 3100, loss[loss=0.1753, simple_loss=0.256, pruned_loss=0.04731, over 24673.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2531, pruned_loss=0.0511, over 4729966.79 frames. ], batch size: 68, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:45:34,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:45:36,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:45:39,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 08:45:42,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 08:45:42,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 08:45:44,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:45:47,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:47,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:48,049 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=658126.6666666666, ans=0.0 2023-09-30 08:45:51,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=658193.3333333334, ans=0.2 2023-09-30 08:45:52,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:45:55,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:56,629 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.813e+02 2.094e+02 2.454e+02 3.292e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 08:46:01,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 08:46:04,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=658193.3333333334, ans=0.0 2023-09-30 08:46:05,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:46:05,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:07,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:07,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:46:09,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:46:09,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=658260.0, ans=0.0 2023-09-30 08:46:10,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:46:10,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 08:46:10,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:46:12,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:12,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 08:46:14,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:46:20,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:46:20,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 08:46:20,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=658260.0, ans=0.1 2023-09-30 08:46:22,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 08:46:23,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:25,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:26,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:26,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:26,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:46:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:46:28,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:46:30,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:46:30,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:46:30,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:30,156 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 08:46:30,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=658326.6666666666, ans=0.0 2023-09-30 08:46:34,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:37,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 08:46:40,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:46:42,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 08:46:42,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:42,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=658393.3333333334, ans=0.125 2023-09-30 08:46:43,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:43,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 08:46:47,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=658393.3333333334, ans=0.125 2023-09-30 08:46:54,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 08:46:56,191 INFO [train.py:1039] (1/4) Epoch 19, batch 3150, loss[loss=0.1834, simple_loss=0.2631, pruned_loss=0.05186, over 23283.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2517, pruned_loss=0.05088, over 4725206.73 frames. ], batch size: 93, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:46:58,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:00,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:01,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:47:01,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:47:01,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 08:47:03,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:03,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:47:04,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 08:47:06,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:10,830 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 08:47:13,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 08:47:13,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:47:15,476 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 08:47:15,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:47:18,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 08:47:18,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 08:47:18,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 08:47:19,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:19,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:20,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:22,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 08:47:23,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:26,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:47:30,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 08:47:31,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:47:33,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:47:33,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:35,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 08:47:36,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 08:47:38,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:47:38,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 08:47:39,495 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:47:39,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:39,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:47:44,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:47:44,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:47:45,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 08:47:47,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:47:47,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:49,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:47:49,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:49,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 08:47:49,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:51,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 08:47:52,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:52,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 08:47:54,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 08:47:55,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:47:55,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:56,170 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=658660.0, ans=0.125 2023-09-30 08:47:57,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 08:48:00,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:48:00,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:48:03,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:48:04,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:06,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:48:11,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:48:11,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:14,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:48:20,491 INFO [train.py:1039] (1/4) Epoch 19, batch 3200, loss[loss=0.1645, simple_loss=0.2095, pruned_loss=0.05976, over 19404.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2514, pruned_loss=0.05118, over 4703550.88 frames. ], batch size: 388, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:48:20,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:48:20,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:48:24,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:24,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:48:24,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 08:48:26,402 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=658793.3333333334, ans=0.125 2023-09-30 08:48:27,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:48:32,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:48:35,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:43,541 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.911e+02 2.162e+02 2.460e+02 4.180e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-30 08:48:43,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:48:53,423 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.22 vs. limit=15.0 2023-09-30 08:48:56,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 08:48:57,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:49:00,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 08:49:02,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:49:03,088 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.41 vs. limit=12.0 2023-09-30 08:49:05,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:49:05,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:49:05,897 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=658926.6666666666, ans=0.0 2023-09-30 08:49:06,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:49:11,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 08:49:13,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:49:14,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 08:49:16,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 08:49:18,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=658993.3333333334, ans=0.0 2023-09-30 08:49:20,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:49:26,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:27,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:49:27,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:28,547 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 08:49:28,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:49:33,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:49:35,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 08:49:36,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 08:49:36,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 08:49:38,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 08:49:41,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:49:42,868 INFO [train.py:1039] (1/4) Epoch 19, batch 3250, loss[loss=0.1752, simple_loss=0.2624, pruned_loss=0.04406, over 24652.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2515, pruned_loss=0.05124, over 4705044.72 frames. ], batch size: 68, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:49:43,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:49:44,481 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 08:49:44,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:49:44,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:49:46,084 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 08:49:50,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:49:54,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:49:56,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.20 vs. limit=15.0 2023-09-30 08:50:04,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:04,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 08:50:05,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:05,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:50:05,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:07,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:07,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:50:10,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:11,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:50:11,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:12,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:15,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:17,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:18,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:18,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:20,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:20,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:20,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:26,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 08:50:26,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:50:26,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:50:28,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:30,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:50:37,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:50:43,844 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=659326.6666666666, ans=0.125 2023-09-30 08:50:43,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=659326.6666666666, ans=0.125 2023-09-30 08:50:45,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:50:47,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:47,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 08:50:47,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:50:47,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:50:47,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:48,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 08:50:50,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 08:50:50,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:52,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:52,338 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=659393.3333333334, ans=0.125 2023-09-30 08:50:53,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:53,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:50:53,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:58,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:58,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:01,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 08:51:01,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:02,190 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.94 vs. limit=15.0 2023-09-30 08:51:03,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:51:03,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 08:51:05,721 INFO [train.py:1039] (1/4) Epoch 19, batch 3300, loss[loss=0.1689, simple_loss=0.243, pruned_loss=0.04744, over 24434.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2525, pruned_loss=0.05119, over 4731685.41 frames. ], batch size: 58, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:51:07,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:51:07,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 08:51:08,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 08:51:10,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 08:51:10,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:12,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=659460.0, ans=0.125 2023-09-30 08:51:14,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=659460.0, ans=0.125 2023-09-30 08:51:16,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:18,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:51:19,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:19,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:51:22,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:51:24,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:25,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:51:28,248 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.763e+02 1.973e+02 2.210e+02 4.562e+02, threshold=3.946e+02, percent-clipped=1.0 2023-09-30 08:51:29,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 08:51:30,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:51:30,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:32,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:32,909 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 08:51:35,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:51:37,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:51:37,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:51:37,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:51:38,714 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 08:51:41,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=659593.3333333334, ans=0.05 2023-09-30 08:51:43,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:43,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:51:44,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:44,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 08:51:46,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 08:51:46,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:47,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:51:50,094 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 08:51:51,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 08:51:51,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:51:54,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 08:51:54,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:51:59,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:52:00,015 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=659660.0, ans=0.0 2023-09-30 08:52:01,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:02,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:02,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:02,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:52:02,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:52:06,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:52:06,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:06,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:52:07,784 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 08:52:09,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 08:52:09,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=659726.6666666666, ans=0.125 2023-09-30 08:52:11,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:52:11,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:11,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:13,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:13,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:15,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:52:16,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:16,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:52:16,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:20,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:52:23,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 08:52:23,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:23,858 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=659726.6666666666, ans=0.1 2023-09-30 08:52:25,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:25,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=659726.6666666666, ans=0.125 2023-09-30 08:52:26,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:52:28,528 INFO [train.py:1039] (1/4) Epoch 19, batch 3350, loss[loss=0.1527, simple_loss=0.2404, pruned_loss=0.03253, over 24650.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2534, pruned_loss=0.05142, over 4722325.62 frames. ], batch size: 65, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:52:28,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:52:28,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:30,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:30,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:32,272 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=659793.3333333334, ans=0.0 2023-09-30 08:52:33,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:33,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:34,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:52:36,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:38,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:52:39,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:41,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:52:41,712 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:52:43,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 08:52:44,957 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 08:52:46,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:48,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 08:52:48,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 08:52:50,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:52:50,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:52:51,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:52,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 08:52:54,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:54,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:52:57,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:58,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:58,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:00,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:53:01,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=659926.6666666666, ans=0.0 2023-09-30 08:53:03,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:06,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:07,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:11,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:53:12,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:13,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=659926.6666666666, ans=0.2 2023-09-30 08:53:14,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:16,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:16,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=659993.3333333334, ans=0.0 2023-09-30 08:53:18,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:21,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 08:53:21,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:53:21,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 08:53:21,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:53:23,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 08:53:24,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:28,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:34,245 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.81 vs. limit=15.0 2023-09-30 08:53:34,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:34,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 08:53:35,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:53:37,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:53:39,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:53:42,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.63 vs. limit=15.0 2023-09-30 08:53:44,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:53:46,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 08:53:46,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:53:47,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:53:49,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:49,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 08:53:49,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:49,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 08:53:51,450 INFO [train.py:1039] (1/4) Epoch 19, batch 3400, loss[loss=0.1665, simple_loss=0.251, pruned_loss=0.04102, over 24681.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2544, pruned_loss=0.05203, over 4721981.52 frames. ], batch size: 68, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:53:51,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:51,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:53,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:53:54,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:53:54,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 08:54:01,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 08:54:01,311 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 08:54:01,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:06,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:54:06,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:54:06,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:08,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:54:14,446 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.902e+02 2.101e+02 2.445e+02 3.700e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-30 08:54:14,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:16,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 08:54:20,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:54:23,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:23,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:23,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:54:24,683 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=12.0 2023-09-30 08:54:31,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:54:36,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 08:54:41,909 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.21 vs. limit=10.0 2023-09-30 08:54:42,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:42,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:43,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 08:54:43,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:54:45,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:46,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:48,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:54:50,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:53,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:54:53,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:55:00,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:03,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 08:55:08,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:55:11,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 08:55:13,178 INFO [train.py:1039] (1/4) Epoch 19, batch 3450, loss[loss=0.1598, simple_loss=0.2228, pruned_loss=0.04844, over 23368.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2552, pruned_loss=0.05204, over 4726174.39 frames. ], batch size: 285, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:55:16,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 08:55:18,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:55:19,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:55:19,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 08:55:21,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:23,930 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-09-30 08:55:25,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:55:29,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:55:31,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:31,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:55:31,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:35,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:39,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.77 vs. limit=15.0 2023-09-30 08:55:41,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 08:55:43,281 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=660526.6666666666, ans=0.0 2023-09-30 08:55:48,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 08:55:48,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:55:48,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:55:48,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=660593.3333333334, ans=0.0 2023-09-30 08:55:51,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:56,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 08:55:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:56:01,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:01,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:56:02,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:56:04,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:56:06,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 08:56:06,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:08,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:56:11,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:14,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 08:56:18,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:56:21,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=660726.6666666666, ans=0.125 2023-09-30 08:56:22,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:56:24,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:27,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:31,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=660726.6666666666, ans=0.09899494936611666 2023-09-30 08:56:32,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:32,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:34,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:56:34,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:35,711 INFO [train.py:1039] (1/4) Epoch 19, batch 3500, loss[loss=0.1882, simple_loss=0.2714, pruned_loss=0.05249, over 24056.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2542, pruned_loss=0.05156, over 4730702.03 frames. ], batch size: 80, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:56:38,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:42,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:56:42,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 08:56:45,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:56:48,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 08:56:50,730 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:56:51,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:53,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 08:56:57,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:56:58,829 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.797e+02 1.956e+02 2.209e+02 3.007e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 08:56:59,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:59,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:57:01,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:01,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:57:01,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:01,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:01,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 08:57:04,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:05,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:57:07,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:12,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:12,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 08:57:13,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:15,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:18,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:57:20,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:21,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:57:21,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:23,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 08:57:23,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 08:57:25,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 08:57:25,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:26,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:28,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:28,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:57:31,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:57:33,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:57:37,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:57:38,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=660993.3333333334, ans=0.1 2023-09-30 08:57:38,165 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=660993.3333333334, ans=0.1 2023-09-30 08:57:39,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 08:57:40,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 08:57:40,845 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:57:42,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:42,661 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:45,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:49,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 08:57:50,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:52,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:53,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 08:57:56,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 08:57:56,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:58,224 INFO [train.py:1039] (1/4) Epoch 19, batch 3550, loss[loss=0.1739, simple_loss=0.2474, pruned_loss=0.05014, over 23758.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.252, pruned_loss=0.05113, over 4722344.09 frames. ], batch size: 150, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:57:58,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:58,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:00,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:03,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:58:13,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=661126.6666666666, ans=0.0 2023-09-30 08:58:14,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:15,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:58:20,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:20,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:58:21,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-09-30 08:58:22,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:22,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:58:23,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:58:25,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:27,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:58:27,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:27,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:58:28,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:58:33,858 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.82 vs. limit=6.0 2023-09-30 08:58:34,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:58:34,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:37,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:37,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:38,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:58:38,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 08:58:38,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:40,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:41,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:58:45,547 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=661260.0, ans=0.95 2023-09-30 08:58:48,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:49,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:49,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:50,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 08:58:52,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:58:52,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 08:58:52,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:55,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:58:55,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:59:00,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 08:59:00,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 08:59:07,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:11,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:59:12,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 08:59:21,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 08:59:21,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:59:21,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:59:22,712 INFO [train.py:1039] (1/4) Epoch 19, batch 3600, loss[loss=0.1732, simple_loss=0.258, pruned_loss=0.04421, over 24567.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2518, pruned_loss=0.05091, over 4724153.87 frames. ], batch size: 71, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:59:22,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:23,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=661460.0, ans=0.1 2023-09-30 08:59:24,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:59:29,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:31,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:31,301 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:59:32,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:59:32,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:59:34,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:34,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 08:59:35,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:59:37,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:41,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:43,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:45,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:59:45,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:45,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 08:59:46,971 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.820e+02 2.003e+02 2.240e+02 3.370e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-30 08:59:47,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:47,495 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:59:49,011 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=661526.6666666666, ans=0.025 2023-09-30 08:59:50,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:50,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:59:53,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:55,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:55,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=661593.3333333334, ans=0.2 2023-09-30 08:59:56,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:59:57,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 09:00:02,728 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=661593.3333333334, ans=0.125 2023-09-30 09:00:05,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:06,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:00:06,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 09:00:12,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:00:19,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:22,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:24,748 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=661660.0, ans=0.125 2023-09-30 09:00:28,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:00:28,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:00:28,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 09:00:31,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 09:00:32,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 09:00:34,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:00:34,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:00:35,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 09:00:37,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:00:37,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:00:38,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:39,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 09:00:39,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 09:00:44,183 INFO [train.py:1039] (1/4) Epoch 19, batch 3650, loss[loss=0.2, simple_loss=0.2685, pruned_loss=0.06568, over 23380.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2528, pruned_loss=0.05146, over 4694176.63 frames. ], batch size: 285, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 09:00:44,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:45,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 09:00:46,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=661793.3333333334, ans=0.95 2023-09-30 09:00:49,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 09:00:52,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:00:54,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=661793.3333333334, ans=0.0 2023-09-30 09:00:57,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 09:00:59,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 09:01:02,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:02,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:01:03,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:01:05,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:01:05,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:01:07,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 09:01:07,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:01:07,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:10,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 09:01:11,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:01:12,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:12,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:14,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:01:17,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 09:01:19,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 09:01:20,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:01:22,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 09:01:23,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:25,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:01:29,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:01:31,281 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.11 vs. limit=15.0 2023-09-30 09:01:32,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:32,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:01:34,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:01:34,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:01:35,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:01:40,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:42,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:42,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:44,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:01:46,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:46,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:51,026 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 09:01:55,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:55,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:56,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:01:58,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:00,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:02:00,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:03,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 09:02:03,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:05,848 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.42 vs. limit=10.0 2023-09-30 09:02:06,513 INFO [train.py:1039] (1/4) Epoch 19, batch 3700, loss[loss=0.1592, simple_loss=0.2353, pruned_loss=0.04159, over 24615.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2532, pruned_loss=0.05177, over 4706792.17 frames. ], batch size: 60, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:02:06,648 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:02:07,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=662126.6666666666, ans=0.125 2023-09-30 09:02:10,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:02:10,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:02:13,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:13,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 09:02:13,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:14,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:02:14,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:02:16,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:02:22,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:02:22,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:23,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:02:23,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:25,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:02:28,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:28,314 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 09:02:31,268 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.890e+02 2.038e+02 2.335e+02 3.154e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-30 09:02:38,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:02:38,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:02:39,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:02:39,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 09:02:39,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:43,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:45,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 09:02:45,959 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.36 vs. limit=10.0 2023-09-30 09:02:46,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:48,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:02:51,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:51,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:02:55,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:02:55,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=662326.6666666666, ans=0.125 2023-09-30 09:02:58,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:58,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 09:03:00,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:00,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 09:03:04,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:03:05,710 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.27 vs. limit=12.0 2023-09-30 09:03:06,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:03:09,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:10,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 09:03:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:03:11,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:03:11,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:13,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:18,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:19,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 09:03:21,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 09:03:22,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:03:22,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:24,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:03:25,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:03:27,671 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=662460.0, ans=0.1 2023-09-30 09:03:29,196 INFO [train.py:1039] (1/4) Epoch 19, batch 3750, loss[loss=0.1649, simple_loss=0.2493, pruned_loss=0.04025, over 24602.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2546, pruned_loss=0.05206, over 4719639.36 frames. ], batch size: 71, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:03:29,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:03:29,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=662460.0, ans=0.0 2023-09-30 09:03:31,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:03:32,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:03:32,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 09:03:34,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:03:36,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:03:37,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 09:03:39,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:03:40,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:40,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:40,987 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=662460.0, ans=0.1 2023-09-30 09:03:42,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:03:47,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:49,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=662526.6666666666, ans=0.125 2023-09-30 09:03:50,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:03:52,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:03:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:57,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=662526.6666666666, ans=0.125 2023-09-30 09:03:58,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:03:58,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 09:03:59,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:01,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:01,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:04:05,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 09:04:08,724 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:04:09,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 09:04:10,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:11,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:13,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:04:16,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:18,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:04:20,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 09:04:24,565 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=662660.0, ans=0.125 2023-09-30 09:04:25,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:28,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:04:28,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:04:32,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=662660.0, ans=0.0 2023-09-30 09:04:33,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:04:37,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:04:39,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:04:42,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:04:42,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:04:45,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:04:51,924 INFO [train.py:1039] (1/4) Epoch 19, batch 3800, loss[loss=0.1988, simple_loss=0.2626, pruned_loss=0.06754, over 23861.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2546, pruned_loss=0.05238, over 4715060.61 frames. ], batch size: 195, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:04:55,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:05:01,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:01,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:05:03,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 09:05:04,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:04,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:06,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:05:08,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:05:08,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:08,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=662860.0, ans=0.0 2023-09-30 09:05:10,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:05:11,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:13,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:05:13,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:15,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 09:05:18,089 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.822e+02 1.936e+02 2.155e+02 2.834e+02, threshold=3.873e+02, percent-clipped=0.0 2023-09-30 09:05:19,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:05:21,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:05:23,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:24,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:05:26,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:05:28,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:05:28,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:28,850 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=662926.6666666666, ans=0.125 2023-09-30 09:05:30,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:31,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:36,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:05:36,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 09:05:39,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:05:41,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=662993.3333333334, ans=0.0 2023-09-30 09:05:46,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:05:53,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:05:56,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 09:05:57,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 09:05:59,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:00,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:06:01,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:04,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 09:06:07,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 09:06:07,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 09:06:07,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:09,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:06:11,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=663060.0, ans=0.2 2023-09-30 09:06:13,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:06:14,482 INFO [train.py:1039] (1/4) Epoch 19, batch 3850, loss[loss=0.1746, simple_loss=0.2645, pruned_loss=0.04239, over 24557.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2532, pruned_loss=0.05183, over 4728299.39 frames. ], batch size: 71, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:06:14,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:06:19,135 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.76 vs. limit=15.0 2023-09-30 09:06:19,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:06:21,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 09:06:21,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:06:23,432 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:26,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:06:28,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:31,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:06:32,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 09:06:39,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:41,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:43,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:44,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:06:46,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:47,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:06:50,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:50,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:06:50,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:50,507 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:06:53,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:53,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:54,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:06:54,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 09:06:54,761 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 09:06:56,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:56,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:59,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:01,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:02,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 09:07:05,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 09:07:07,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:09,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 09:07:12,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:07:15,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:17,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:22,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:22,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 09:07:26,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 09:07:27,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:29,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:31,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:07:31,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:07:31,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:07:33,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 09:07:34,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:07:36,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 09:07:36,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:36,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:37,755 INFO [train.py:1039] (1/4) Epoch 19, batch 3900, loss[loss=0.1686, simple_loss=0.2629, pruned_loss=0.03715, over 24657.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2526, pruned_loss=0.0512, over 4727002.12 frames. ], batch size: 73, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:07:37,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:07:39,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:40,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:07:42,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:42,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:42,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=663460.0, ans=0.125 2023-09-30 09:07:43,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:07:43,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 09:07:43,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:46,508 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:07:47,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:49,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:51,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:07:51,630 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=663460.0, ans=0.0 2023-09-30 09:07:52,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:55,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:55,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:57,368 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:07:59,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 09:07:59,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:02,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 09:08:02,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:08:02,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 09:08:03,991 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.880e+02 2.094e+02 2.304e+02 3.533e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 09:08:04,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 09:08:10,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:11,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:08:12,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:08:12,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:17,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:18,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:08:22,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:08:22,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:23,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:08:29,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:29,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:08:36,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:08:37,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:08:49,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:08:53,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:53,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 09:08:53,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 09:08:54,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:55,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 09:08:57,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:58,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 09:09:00,551 INFO [train.py:1039] (1/4) Epoch 19, batch 3950, loss[loss=0.1721, simple_loss=0.261, pruned_loss=0.04158, over 24555.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2522, pruned_loss=0.05107, over 4721227.01 frames. ], batch size: 71, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:09:04,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:09:05,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 09:09:06,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:09:08,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:09:09,069 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.92 vs. limit=15.0 2023-09-30 09:09:09,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:09:10,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=663793.3333333334, ans=0.0 2023-09-30 09:09:11,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.01 vs. limit=15.0 2023-09-30 09:09:15,985 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 09:09:17,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:18,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 09:09:19,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 09:09:19,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:09:21,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:22,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:09:22,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:24,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 09:09:26,360 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=663860.0, ans=0.1 2023-09-30 09:09:27,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:09:28,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:28,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:09:29,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:09:29,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:09:43,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:09:43,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:09:51,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 09:09:55,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=663993.3333333334, ans=0.1 2023-09-30 09:09:58,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 09:09:58,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 09:09:58,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:10:00,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:10:03,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=663993.3333333334, ans=0.0 2023-09-30 09:10:06,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:10:06,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:10:08,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:08,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:10:08,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 09:10:14,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:10:16,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:10:17,542 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.38 vs. limit=15.0 2023-09-30 09:10:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 09:10:24,393 INFO [train.py:1039] (1/4) Epoch 19, batch 4000, loss[loss=0.1698, simple_loss=0.256, pruned_loss=0.04177, over 24651.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2531, pruned_loss=0.05125, over 4719069.32 frames. ], batch size: 68, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:10:30,046 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.53 vs. limit=6.0 2023-09-30 09:10:31,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:38,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:42,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:42,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:10:44,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:44,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 09:10:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:10:45,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 09:10:45,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:10:45,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 09:10:48,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:51,346 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.842e+02 2.148e+02 2.341e+02 3.331e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-30 09:10:53,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:10:53,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:10:53,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:10:53,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:53,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:10:56,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:10:57,726 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 09:10:57,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:10:59,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:10:59,682 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=664260.0, ans=0.1 2023-09-30 09:11:02,408 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 09:11:03,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:11:03,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:09,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 09:11:09,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:11:12,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:11:14,354 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 09:11:15,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:11:17,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 09:11:17,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:11:17,672 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=664326.6666666666, ans=0.09899494936611666 2023-09-30 09:11:18,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:18,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:11:19,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=664326.6666666666, ans=0.125 2023-09-30 09:11:20,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:11:20,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:11:20,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:20,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=664326.6666666666, ans=0.1 2023-09-30 09:11:24,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 09:11:24,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:26,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=664326.6666666666, ans=0.2 2023-09-30 09:11:27,580 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 09:11:30,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:11:34,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:11:37,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:11:37,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:38,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:11:40,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:11:45,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:46,664 INFO [train.py:1039] (1/4) Epoch 19, batch 4050, loss[loss=0.1518, simple_loss=0.2268, pruned_loss=0.03839, over 14933.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2531, pruned_loss=0.05101, over 4725614.19 frames. ], batch size: 32, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:11:48,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:11:49,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten.whitening_limit, batch_count=664460.0, ans=15.0 2023-09-30 09:11:50,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 09:11:51,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:11:51,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:11:53,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:11:55,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:11:56,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:02,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:03,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:05,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:12:06,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:12:06,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:12:12,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:14,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:12:16,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=664526.6666666666, ans=15.0 2023-09-30 09:12:17,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 09:12:19,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 09:12:19,446 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 09:12:22,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:12:24,765 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.53 vs. limit=6.0 2023-09-30 09:12:27,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 09:12:29,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:12:32,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:38,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:38,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:12:38,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:40,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=664660.0, ans=0.1 2023-09-30 09:12:42,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:44,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=664660.0, ans=0.125 2023-09-30 09:12:45,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 09:12:47,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:12:47,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:12:49,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 09:12:50,944 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=664726.6666666666, ans=0.1 2023-09-30 09:12:55,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:13:03,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 09:13:05,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:05,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:13:07,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 09:13:07,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 09:13:07,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:09,180 INFO [train.py:1039] (1/4) Epoch 19, batch 4100, loss[loss=0.1992, simple_loss=0.2824, pruned_loss=0.05805, over 24357.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.254, pruned_loss=0.05124, over 4720403.44 frames. ], batch size: 77, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:13:09,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:11,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:11,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:13:16,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 09:13:16,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 09:13:19,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 09:13:20,199 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.82 vs. limit=6.0 2023-09-30 09:13:20,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 09:13:20,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:21,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:13:22,523 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 09:13:27,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:29,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:13:29,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:29,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:13:33,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:13:34,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:13:34,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 09:13:34,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=664860.0, ans=0.125 2023-09-30 09:13:36,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:36,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:13:36,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:36,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:13:38,156 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.904e+02 2.164e+02 2.650e+02 3.755e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-30 09:13:38,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 09:13:41,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:13:41,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=664926.6666666666, ans=0.1 2023-09-30 09:13:43,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 09:13:45,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:45,508 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=664926.6666666666, ans=0.0 2023-09-30 09:13:48,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:48,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 09:13:48,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:50,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:13:50,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:13:51,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 09:13:53,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:13:54,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:13:57,714 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 09:13:57,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:57,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:01,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:07,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:10,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:11,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:14:13,992 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=664993.3333333334, ans=0.0 2023-09-30 09:14:16,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=665060.0, ans=0.125 2023-09-30 09:14:19,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:19,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:22,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:23,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:14:28,239 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:29,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:14:31,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:14:31,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:33,073 INFO [train.py:1039] (1/4) Epoch 19, batch 4150, loss[loss=0.1882, simple_loss=0.2725, pruned_loss=0.05194, over 24626.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2549, pruned_loss=0.05235, over 4701899.86 frames. ], batch size: 65, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:14:34,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 09:14:34,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:35,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=665126.6666666666, ans=0.125 2023-09-30 09:14:36,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 09:14:37,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 09:14:37,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 09:14:39,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:43,818 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=665126.6666666666, ans=0.04949747468305833 2023-09-30 09:14:44,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:14:45,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:50,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:14:52,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:14:52,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:14:55,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:14:55,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:56,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:15:02,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:08,076 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:08,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 09:15:08,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=665260.0, ans=0.125 2023-09-30 09:15:11,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 09:15:11,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:15:12,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 09:15:12,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:15:12,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:15,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:15,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:22,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 09:15:26,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:26,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=665326.6666666666, ans=0.125 2023-09-30 09:15:28,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:15:28,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 09:15:29,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:31,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 09:15:34,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:15:34,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:35,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:37,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 09:15:37,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:15:37,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=665393.3333333334, ans=0.2 2023-09-30 09:15:38,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:15:39,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:15:41,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 09:15:42,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:42,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:15:42,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:15:44,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 09:15:44,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:45,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:15:45,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:47,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=665393.3333333334, ans=0.125 2023-09-30 09:15:48,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:50,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 09:15:50,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:50,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=665393.3333333334, ans=0.125 2023-09-30 09:15:55,461 INFO [train.py:1039] (1/4) Epoch 19, batch 4200, loss[loss=0.1725, simple_loss=0.2608, pruned_loss=0.04208, over 24629.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2539, pruned_loss=0.05187, over 4706062.75 frames. ], batch size: 68, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:15:55,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:15:55,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 09:15:58,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:16:00,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:02,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:16:02,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:02,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:05,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 09:16:08,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 09:16:08,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:11,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:15,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:16:18,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:16:19,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:19,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:21,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 09:16:21,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:22,942 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.959e+02 2.201e+02 2.604e+02 4.093e+02, threshold=4.401e+02, percent-clipped=0.0 2023-09-30 09:16:23,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:23,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:23,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:16:25,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:16:27,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 09:16:27,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:29,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=665593.3333333334, ans=0.125 2023-09-30 09:16:32,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:16:33,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:16:36,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=665593.3333333334, ans=0.125 2023-09-30 09:16:37,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:16:38,436 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.60 vs. limit=15.0 2023-09-30 09:16:38,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:16:40,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:16:40,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 09:16:40,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:16:42,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:16:48,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:16:49,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:54,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:16:58,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 09:17:01,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:17:06,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:17:08,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:10,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 09:17:16,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:17:17,685 INFO [train.py:1039] (1/4) Epoch 19, batch 4250, loss[loss=0.1839, simple_loss=0.2506, pruned_loss=0.0586, over 23776.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2524, pruned_loss=0.05137, over 4709339.77 frames. ], batch size: 212, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:17:19,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:17:19,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:17:21,430 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:17:22,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:27,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:17:29,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 09:17:29,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:17:33,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:33,422 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=665860.0, ans=0.1 2023-09-30 09:17:36,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:36,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=665860.0, ans=0.125 2023-09-30 09:17:38,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=665860.0, ans=0.125 2023-09-30 09:17:40,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:40,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:40,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=665860.0, ans=0.0 2023-09-30 09:17:43,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:17:43,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:17:43,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=665860.0, ans=0.0 2023-09-30 09:17:46,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:48,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:49,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:51,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:17:52,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:17:54,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 09:17:56,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=665926.6666666666, ans=0.125 2023-09-30 09:17:58,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 09:17:58,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:58,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:59,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:59,182 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:18:00,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:18:00,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:01,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:18:05,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:18:07,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:18:11,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:13,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:13,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 09:18:13,513 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=665993.3333333334, ans=0.05 2023-09-30 09:18:14,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:18:14,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 09:18:16,952 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:18:19,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:18:21,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:21,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:18:22,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 09:18:24,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:18:24,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:18:24,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=666060.0, ans=0.0 2023-09-30 09:18:27,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:30,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:32,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=666060.0, ans=0.0 2023-09-30 09:18:33,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:18:35,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:35,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:37,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:18:38,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:18:38,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 09:18:41,242 INFO [train.py:1039] (1/4) Epoch 19, batch 4300, loss[loss=0.1821, simple_loss=0.2675, pruned_loss=0.04838, over 24597.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2523, pruned_loss=0.05113, over 4711492.50 frames. ], batch size: 71, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:18:41,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:44,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:45,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.61 vs. limit=6.0 2023-09-30 09:18:46,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:18:51,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:58,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:58,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 09:18:59,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:18:59,930 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=666193.3333333334, ans=0.125 2023-09-30 09:19:01,324 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:19:01,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:19:01,386 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 09:19:01,646 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=666193.3333333334, ans=0.0 2023-09-30 09:19:04,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:19:07,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:08,727 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.820e+02 2.101e+02 2.491e+02 4.654e+02, threshold=4.202e+02, percent-clipped=1.0 2023-09-30 09:19:09,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 09:19:09,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:19:09,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 09:19:12,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:19:14,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:19:17,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:19:19,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:19:19,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:19:20,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:20,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=666260.0, ans=0.125 2023-09-30 09:19:22,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:19:24,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 09:19:24,344 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 09:19:27,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:19:30,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:19:30,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:31,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 09:19:31,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 09:19:31,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 09:19:31,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:19:32,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 09:19:32,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 09:19:32,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=666326.6666666666, ans=0.0 2023-09-30 09:19:36,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:38,168 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 09:19:40,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:19:42,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:42,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:46,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 09:19:46,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:46,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:47,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:19:47,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:19:49,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:19:52,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:19:55,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:57,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:57,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:20:02,593 INFO [train.py:1039] (1/4) Epoch 19, batch 4350, loss[loss=0.1927, simple_loss=0.2604, pruned_loss=0.06253, over 23725.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2532, pruned_loss=0.0515, over 4712160.32 frames. ], batch size: 164, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:20:02,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 09:20:02,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:20:10,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:13,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:16,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:20:16,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:20:16,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=666526.6666666666, ans=0.125 2023-09-30 09:20:20,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:20:24,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:27,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:20:27,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:20:30,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:20:30,365 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=666526.6666666666, ans=0.07 2023-09-30 09:20:33,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:20:35,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:20:39,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 09:20:41,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:42,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:47,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:50,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 09:20:55,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:20:56,213 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.68 vs. limit=15.0 2023-09-30 09:20:57,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:20:57,834 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.27 vs. limit=6.0 2023-09-30 09:21:02,692 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 09:21:04,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:04,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:21:06,190 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 09:21:06,331 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 09:21:07,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:07,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:07,972 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=666660.0, ans=0.125 2023-09-30 09:21:09,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:21:09,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:09,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:09,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:21:13,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 09:21:13,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:13,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:13,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:14,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 09:21:16,308 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 09:21:16,315 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 09:21:16,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 09:21:20,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:21:20,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:21:22,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:23,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:21:23,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 09:21:24,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=666726.6666666666, ans=0.125 2023-09-30 09:21:25,495 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 09:21:25,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:26,826 INFO [train.py:1039] (1/4) Epoch 19, batch 4400, loss[loss=0.1726, simple_loss=0.2609, pruned_loss=0.04221, over 24677.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2538, pruned_loss=0.05125, over 4726575.16 frames. ], batch size: 73, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:21:29,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:30,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:31,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:35,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 09:21:35,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 09:21:37,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 09:21:37,098 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 09:21:37,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:21:37,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:40,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 09:21:43,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:43,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=666860.0, ans=0.125 2023-09-30 09:21:45,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:45,233 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 09:21:49,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:49,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 09:21:49,764 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 09:21:50,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=666860.0, ans=0.0 2023-09-30 09:21:54,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 09:21:54,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 09:21:54,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 09:21:54,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:55,659 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.851e+02 2.031e+02 2.280e+02 3.356e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 09:21:55,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:00,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 09:22:00,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 09:22:00,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:02,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:22:02,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:03,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:05,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:05,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 09:22:06,759 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 09:22:10,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:16,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:17,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 09:22:20,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=666993.3333333334, ans=0.1 2023-09-30 09:22:23,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:22:24,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:25,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.87 vs. limit=15.0 2023-09-30 09:22:29,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:22:29,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 09:22:29,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:22:29,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:22:29,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:22:30,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:22:35,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 09:22:36,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 09:22:38,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 09:22:38,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:38,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 09:22:40,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:22:45,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:22:48,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 09:22:49,462 INFO [train.py:1039] (1/4) Epoch 19, batch 4450, loss[loss=0.1932, simple_loss=0.2689, pruned_loss=0.05881, over 23295.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2551, pruned_loss=0.05148, over 4736477.47 frames. ], batch size: 93, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:22:51,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:53,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:53,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:23:02,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:02,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:23:04,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=667193.3333333334, ans=0.125 2023-09-30 09:23:05,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:05,879 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=667193.3333333334, ans=0.125 2023-09-30 09:23:07,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:23:10,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:23:10,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:11,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 09:23:11,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:11,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:11,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:11,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:23:15,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:23:21,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:22,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:23,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=667260.0, ans=0.2 2023-09-30 09:23:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:24,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:26,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:23:30,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:23:30,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 09:23:31,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 09:23:31,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:23:34,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:35,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 09:23:40,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:23:44,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:44,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 09:23:44,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:44,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:23:44,809 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:23:46,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:48,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:52,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:23:54,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 09:23:55,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:23:58,219 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=667393.3333333334, ans=0.125 2023-09-30 09:23:59,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:59,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:24:01,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:01,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:24:02,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:24:04,925 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=667393.3333333334, ans=0.125 2023-09-30 09:24:06,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 09:24:06,887 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=8.22 vs. limit=12.0 2023-09-30 09:24:08,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:24:10,371 INFO [train.py:1039] (1/4) Epoch 19, batch 4500, loss[loss=0.1765, simple_loss=0.2351, pruned_loss=0.05892, over 22612.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2551, pruned_loss=0.05146, over 4726883.43 frames. ], batch size: 322, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:24:12,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:13,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 09:24:13,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 09:24:16,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:21,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:23,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:23,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:24:25,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:24:25,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:25,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:39,994 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.864e+02 2.108e+02 2.361e+02 3.088e+02, threshold=4.216e+02, percent-clipped=0.0 2023-09-30 09:24:40,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:41,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:24:43,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:24:43,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:24:44,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:24:51,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:24:54,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:24:59,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:25:02,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:25:02,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 09:25:04,320 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:04,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:07,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.57 vs. limit=15.0 2023-09-30 09:25:08,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:08,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:25:11,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:25:11,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 09:25:11,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:25:11,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:17,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:25:17,556 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:25:20,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:22,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:25:22,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:25:23,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 09:25:26,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 09:25:26,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 09:25:30,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=667793.3333333334, ans=0.0 2023-09-30 09:25:31,114 INFO [train.py:1039] (1/4) Epoch 19, batch 4550, loss[loss=0.1868, simple_loss=0.2654, pruned_loss=0.05413, over 23675.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2538, pruned_loss=0.05143, over 4716139.52 frames. ], batch size: 85, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:25:31,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 09:25:33,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 09:25:35,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:36,641 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=667793.3333333334, ans=0.1 2023-09-30 09:25:38,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:40,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:42,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:42,495 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=667793.3333333334, ans=0.0 2023-09-30 09:25:45,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:25:47,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:48,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:25:48,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:25:48,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:51,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:51,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:54,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:25:56,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 09:25:58,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 09:25:58,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:25:59,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 09:26:05,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 09:26:07,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:10,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 09:26:13,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:26:15,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:26:18,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 09:26:21,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:23,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=667993.3333333334, ans=0.125 2023-09-30 09:26:24,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:24,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:26,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:27,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 09:26:28,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 09:26:28,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:26:29,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 09:26:32,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 09:26:32,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:34,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:35,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:35,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:35,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:26:38,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:26:38,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 09:26:40,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:40,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:26:42,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 09:26:42,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:26:42,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 09:26:45,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:26:45,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:26:47,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:26:48,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:49,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:26:50,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:26:53,410 INFO [train.py:1039] (1/4) Epoch 19, batch 4600, loss[loss=0.1665, simple_loss=0.2418, pruned_loss=0.04566, over 24493.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2529, pruned_loss=0.05134, over 4710733.63 frames. ], batch size: 58, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:26:53,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:26:56,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:56,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:58,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:26:58,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:26:59,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:01,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 09:27:01,824 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=668126.6666666666, ans=0.1 2023-09-30 09:27:03,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:27:07,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:27:09,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:10,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:18,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 09:27:18,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:22,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:25,423 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.849e+02 2.100e+02 2.657e+02 4.568e+02, threshold=4.200e+02, percent-clipped=2.0 2023-09-30 09:27:27,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:27:27,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 09:27:32,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:27:33,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:27:38,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:38,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:27:40,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:27:45,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 09:27:47,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:27:51,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=668326.6666666666, ans=0.0 2023-09-30 09:27:52,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:53,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:27:55,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:55,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 09:27:55,683 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=668326.6666666666, ans=0.125 2023-09-30 09:27:56,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:58,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 09:27:58,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:00,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:01,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:01,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:01,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:02,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=668393.3333333334, ans=0.125 2023-09-30 09:28:03,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 09:28:04,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 09:28:04,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 09:28:04,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:06,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:07,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:08,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:16,106 INFO [train.py:1039] (1/4) Epoch 19, batch 4650, loss[loss=0.1696, simple_loss=0.2414, pruned_loss=0.04895, over 23591.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2527, pruned_loss=0.05087, over 4722703.52 frames. ], batch size: 256, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:28:19,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:28:22,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:22,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:24,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:28:24,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:24,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:25,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:25,358 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=668460.0, ans=0.125 2023-09-30 09:28:29,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 09:28:31,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:28:34,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 09:28:34,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:36,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 09:28:36,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:28:36,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 09:28:36,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=668526.6666666666, ans=0.125 2023-09-30 09:28:37,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 09:28:37,771 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:39,238 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:28:42,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:28:43,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:43,905 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 09:28:46,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:47,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=668593.3333333334, ans=0.125 2023-09-30 09:28:48,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 09:28:48,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=668593.3333333334, ans=0.0 2023-09-30 09:28:52,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:52,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:28:53,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 09:28:55,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:55,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=668593.3333333334, ans=0.125 2023-09-30 09:28:59,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:29:01,034 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=668593.3333333334, ans=0.125 2023-09-30 09:29:02,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:07,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:10,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:10,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:12,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:29:12,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.85 vs. limit=15.0 2023-09-30 09:29:15,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 09:29:16,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 09:29:16,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 09:29:16,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 09:29:18,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:25,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:29:25,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:26,465 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 09:29:26,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:26,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:26,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:29:28,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=668726.6666666666, ans=0.125 2023-09-30 09:29:28,654 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=15.0 2023-09-30 09:29:30,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:29:31,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:29:31,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:34,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:38,517 INFO [train.py:1039] (1/4) Epoch 19, batch 4700, loss[loss=0.1706, simple_loss=0.2448, pruned_loss=0.04822, over 23438.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2535, pruned_loss=0.05098, over 4727870.55 frames. ], batch size: 285, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:29:38,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:38,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:29:38,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:29:38,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 09:29:38,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=668793.3333333334, ans=0.125 2023-09-30 09:29:40,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:29:41,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 09:29:50,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=668793.3333333334, ans=0.125 2023-09-30 09:29:51,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:51,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:52,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:29:52,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:55,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:29:56,559 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=12.0 2023-09-30 09:30:00,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 09:30:02,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 09:30:03,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:05,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:30:05,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:30:09,360 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.848e+02 1.982e+02 2.168e+02 3.287e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 09:30:11,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:16,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:30:17,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:30:21,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:30:27,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 09:30:28,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:30:31,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:35,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 09:30:37,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:30:41,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:30:42,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 09:30:44,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:44,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:47,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:48,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:30:49,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 09:30:49,155 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 09:30:50,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:52,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 09:30:53,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:58,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 09:31:00,077 INFO [train.py:1039] (1/4) Epoch 19, batch 4750, loss[loss=0.1813, simple_loss=0.2612, pruned_loss=0.05066, over 23358.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.254, pruned_loss=0.05087, over 4724401.15 frames. ], batch size: 105, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:31:01,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:31:03,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:31:09,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 09:31:10,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:15,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 09:31:16,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:31:16,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:31:16,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:24,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 09:31:25,380 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=669193.3333333334, ans=0.125 2023-09-30 09:31:28,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:31:31,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 09:31:31,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:34,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:35,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:36,507 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 09:31:36,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 09:31:43,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 09:31:46,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:49,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:31:51,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:31:51,533 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 09:31:51,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:31:53,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:31:56,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:31:58,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 09:31:58,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 09:31:58,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:58,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:31:58,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:00,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:32:01,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 09:32:04,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 09:32:05,164 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-09-30 09:32:09,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:11,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:32:11,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 09:32:11,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:13,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:16,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:32:16,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:18,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:32:20,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=669393.3333333334, ans=0.1 2023-09-30 09:32:21,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:21,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 09:32:21,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 09:32:23,303 INFO [train.py:1039] (1/4) Epoch 19, batch 4800, loss[loss=0.2026, simple_loss=0.2701, pruned_loss=0.06758, over 23878.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2539, pruned_loss=0.05076, over 4731035.42 frames. ], batch size: 195, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:32:23,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 09:32:26,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:32:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:28,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 09:32:32,172 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=669460.0, ans=0.0 2023-09-30 09:32:34,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:36,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:39,409 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=669526.6666666666, ans=0.125 2023-09-30 09:32:40,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:32:42,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:42,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:42,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 09:32:44,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:44,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:32:44,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:32:51,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:32:52,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:52,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:32:54,169 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.880e+02 2.098e+02 2.403e+02 3.356e+02, threshold=4.196e+02, percent-clipped=0.0 2023-09-30 09:32:54,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:55,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:32:55,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:55,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:58,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:59,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:33:04,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:33:06,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:07,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 09:33:07,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 09:33:09,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:09,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:33:09,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:33:09,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:09,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:33:12,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:33:13,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:17,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:19,619 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=669660.0, ans=0.2 2023-09-30 09:33:20,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:23,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:28,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 09:33:28,871 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:30,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:30,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:33:30,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:34,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:34,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=669726.6666666666, ans=0.125 2023-09-30 09:33:35,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:33:35,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:36,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:33:37,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:33:37,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:33:42,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:42,642 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:33:43,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:43,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:44,620 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-09-30 09:33:45,378 INFO [train.py:1039] (1/4) Epoch 19, batch 4850, loss[loss=0.1655, simple_loss=0.2408, pruned_loss=0.04509, over 24492.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2543, pruned_loss=0.05155, over 4715546.71 frames. ], batch size: 58, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:33:45,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 09:33:47,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 09:33:47,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:33:47,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:50,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:34:00,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 09:34:00,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:05,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:07,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:34:07,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:11,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:13,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:34:14,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:34:14,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 09:34:19,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:34:20,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:34:20,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:34:22,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:34:22,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 09:34:25,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:25,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:28,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:29,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 09:34:30,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 09:34:31,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:34:39,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:34:39,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 09:34:42,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:34:42,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:34:44,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:34:46,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 09:34:46,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:46,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=669993.3333333334, ans=0.125 2023-09-30 09:34:46,810 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.08 vs. limit=15.0 2023-09-30 09:34:47,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 09:34:47,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:49,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:34:49,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 09:34:49,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=669993.3333333334, ans=0.5 2023-09-30 09:34:58,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:05,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:35:05,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:08,132 INFO [train.py:1039] (1/4) Epoch 19, batch 4900, loss[loss=0.1817, simple_loss=0.249, pruned_loss=0.05722, over 19581.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2541, pruned_loss=0.05182, over 4702107.23 frames. ], batch size: 42, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:35:11,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 09:35:11,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:35:16,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:18,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:18,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:35:21,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 09:35:23,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=670193.3333333334, ans=0.0 2023-09-30 09:35:25,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 09:35:28,735 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=22.5 2023-09-30 09:35:29,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 09:35:31,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 09:35:31,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:31,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:31,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:35:31,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:31,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:35:31,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=670193.3333333334, ans=0.0 2023-09-30 09:35:32,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 09:35:33,633 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.92 vs. limit=15.0 2023-09-30 09:35:38,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 09:35:38,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:35:39,976 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 2.060e+02 2.332e+02 2.793e+02 4.381e+02, threshold=4.664e+02, percent-clipped=2.0 2023-09-30 09:35:41,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:35:41,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:43,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:35:44,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:46,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:46,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 09:35:49,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:35:49,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:49,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 09:35:49,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 09:35:55,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 09:35:56,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:35:58,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:35:58,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:35:59,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:59,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:35:59,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:36:00,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=670326.6666666666, ans=0.125 2023-09-30 09:36:01,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 09:36:03,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:04,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:36:04,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:36:08,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 09:36:08,712 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=15.0 2023-09-30 09:36:09,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:36:11,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 09:36:11,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 09:36:20,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:21,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:36:23,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 09:36:23,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:23,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:36:25,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:29,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:29,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:36:31,091 INFO [train.py:1039] (1/4) Epoch 19, batch 4950, loss[loss=0.1773, simple_loss=0.2552, pruned_loss=0.04974, over 24342.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2529, pruned_loss=0.05163, over 4704278.71 frames. ], batch size: 61, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:36:31,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:31,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:36:32,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:36:34,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:35,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:37,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 09:36:37,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 09:36:37,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:36:39,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 09:36:39,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:39,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:36:39,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:36:39,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:36:40,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:41,190 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=670460.0, ans=0.1 2023-09-30 09:36:42,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:36:44,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:36:46,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:49,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:49,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:54,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:36:54,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=670526.6666666666, ans=0.125 2023-09-30 09:36:58,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:59,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:37:01,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:02,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:04,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:37:06,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 09:37:07,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 09:37:07,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=670593.3333333334, ans=0.125 2023-09-30 09:37:10,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:11,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:37:11,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:37:13,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:37:13,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:37:15,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:37:16,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:18,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:37:20,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:37:22,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:22,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:24,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 09:37:24,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:37:26,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:37:30,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:37:32,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:37:32,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:37:34,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:34,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:37:34,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:37:35,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:37:37,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:37:37,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:39,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 09:37:43,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:37:49,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 09:37:49,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:37:53,447 INFO [train.py:1039] (1/4) Epoch 19, batch 5000, loss[loss=0.1812, simple_loss=0.2451, pruned_loss=0.05866, over 22810.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2522, pruned_loss=0.05134, over 4706813.78 frames. ], batch size: 322, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:37:57,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:57,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:37:59,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 09:38:00,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 09:38:02,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:04,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 09:38:04,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:38:04,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:38:05,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 09:38:07,292 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:08,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:08,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 09:38:08,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:10,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:10,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 09:38:10,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=670860.0, ans=0.125 2023-09-30 09:38:11,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 09:38:12,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:38:13,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 09:38:13,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:38:14,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:15,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:38:15,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 09:38:15,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 09:38:18,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 09:38:18,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:18,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:19,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 09:38:19,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:38:21,708 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=670860.0, ans=0.0 2023-09-30 09:38:22,768 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:22,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:24,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:38:24,923 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=670926.6666666666, ans=0.1 2023-09-30 09:38:26,581 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.806e+02 2.052e+02 2.286e+02 3.432e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 09:38:26,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 09:38:28,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:38:29,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:38:34,782 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 09:38:38,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:38,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:38,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:38:43,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 09:38:43,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:43,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:43,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:38:46,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 09:38:46,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:49,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:51,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:53,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=670993.3333333334, ans=0.0 2023-09-30 09:38:55,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 09:38:59,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:07,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:39:10,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:10,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:39:10,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:39:10,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:39:11,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:14,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:15,421 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.51 vs. limit=15.0 2023-09-30 09:39:16,092 INFO [train.py:1039] (1/4) Epoch 19, batch 5050, loss[loss=0.1727, simple_loss=0.2374, pruned_loss=0.05398, over 23470.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2521, pruned_loss=0.05129, over 4697093.46 frames. ], batch size: 285, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:39:16,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 09:39:16,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:39:17,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:19,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:39:19,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 09:39:20,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:21,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:39:22,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:39:24,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:39:24,480 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=671126.6666666666, ans=0.0 2023-09-30 09:39:25,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:39:25,881 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=671126.6666666666, ans=0.125 2023-09-30 09:39:32,601 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=671193.3333333334, ans=0.0 2023-09-30 09:39:34,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=671193.3333333334, ans=0.1 2023-09-30 09:39:38,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 09:39:39,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:39:39,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:39:41,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 09:39:42,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:39:42,299 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=671193.3333333334, ans=0.125 2023-09-30 09:39:44,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:44,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:46,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:39:46,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 09:39:47,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 09:39:47,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:50,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:39:53,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:54,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 09:39:55,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:40:00,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 09:40:01,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:40:01,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:40:03,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:03,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:40:03,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=671326.6666666666, ans=0.125 2023-09-30 09:40:06,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:07,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:40:09,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:09,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:40:09,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:40:09,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 09:40:11,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:40:13,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:40:19,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:40:19,136 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 09:40:19,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:40:19,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=671326.6666666666, ans=0.125 2023-09-30 09:40:20,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:22,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:22,161 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 09:40:23,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:23,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 09:40:23,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:28,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:28,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:28,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 09:40:30,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 09:40:31,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:31,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:40:33,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:40:36,329 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 09:40:37,742 INFO [train.py:1039] (1/4) Epoch 19, batch 5100, loss[loss=0.2291, simple_loss=0.288, pruned_loss=0.0851, over 19016.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2536, pruned_loss=0.05161, over 4690982.06 frames. ], batch size: 388, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:40:37,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:40,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 09:40:41,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 09:40:41,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=671460.0, ans=0.025 2023-09-30 09:40:43,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:44,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:48,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:48,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 09:40:50,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 09:40:54,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:54,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:40:58,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:41:01,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 09:41:03,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:04,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:41:04,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:41:07,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 09:41:10,887 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.887e+02 2.088e+02 2.426e+02 5.296e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-30 09:41:11,043 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 09:41:11,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:11,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=671593.3333333334, ans=0.125 2023-09-30 09:41:12,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 09:41:12,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 09:41:16,592 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-09-30 09:41:17,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:25,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=671593.3333333334, ans=0.125 2023-09-30 09:41:25,330 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=671593.3333333334, ans=0.125 2023-09-30 09:41:26,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:41:30,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 09:41:30,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 09:41:30,233 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 09:41:33,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 09:41:33,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:33,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.86 vs. limit=22.5 2023-09-30 09:41:34,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 09:41:39,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 09:41:40,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:41:42,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:41:45,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 09:41:47,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:41:47,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 09:41:52,856 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=671726.6666666666, ans=0.0 2023-09-30 09:41:53,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:41:53,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:41:53,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:41:54,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:41:54,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:41:55,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:41:57,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 09:41:57,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 09:41:57,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 09:41:59,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:41:59,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 09:42:00,981 INFO [train.py:1039] (1/4) Epoch 19, batch 5150, loss[loss=0.1854, simple_loss=0.2616, pruned_loss=0.05464, over 23163.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2544, pruned_loss=0.05215, over 4695201.83 frames. ], batch size: 105, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:42:01,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:01,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:42:03,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:04,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:10,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:42:10,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 09:42:10,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:12,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:42:14,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:42:14,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:14,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:15,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:42:15,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:42:17,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 09:42:18,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:42:18,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:42:20,424 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:42:21,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:42:25,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 09:42:25,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:42:29,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:42:34,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 09:42:39,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:43,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:45,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:47,679 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.87 vs. limit=15.0 2023-09-30 09:42:51,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:42:53,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:42:54,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 09:42:59,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:59,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:42:59,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:43:03,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:03,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:43:04,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 09:43:10,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:12,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:43:15,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:43:15,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:43:15,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:43:16,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:43:16,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:43:16,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:43:20,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:43:21,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:43:23,308 INFO [train.py:1039] (1/4) Epoch 19, batch 5200, loss[loss=0.1809, simple_loss=0.2651, pruned_loss=0.04837, over 24683.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2556, pruned_loss=0.05241, over 4700320.87 frames. ], batch size: 68, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:43:23,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:29,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 09:43:30,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:43:30,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:33,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:36,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:43:37,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:39,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 09:43:42,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:43:44,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:47,131 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:43:48,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 09:43:48,596 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=672193.3333333334, ans=0.0 2023-09-30 09:43:49,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:43:51,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:43:51,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 09:43:52,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 09:43:53,428 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:43:54,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 09:43:56,150 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.839e+02 2.019e+02 2.213e+02 3.440e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 09:43:56,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:56,314 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 09:43:56,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:57,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:57,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:43:59,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 09:44:00,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:02,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:05,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 09:44:05,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 09:44:07,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 09:44:12,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 09:44:14,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:44:19,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672326.6666666666, ans=0.1 2023-09-30 09:44:21,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:44:21,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:23,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 09:44:23,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:23,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 09:44:23,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:23,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:44:28,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:28,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:44:31,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=672393.3333333334, ans=0.125 2023-09-30 09:44:33,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:44:33,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:33,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:38,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:39,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 09:44:39,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:41,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:44:41,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:42,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:44:42,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:44:45,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:46,781 INFO [train.py:1039] (1/4) Epoch 19, batch 5250, loss[loss=0.1729, simple_loss=0.258, pruned_loss=0.04387, over 24693.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2548, pruned_loss=0.05189, over 4711515.51 frames. ], batch size: 73, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:44:48,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:48,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:44:50,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:44:56,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:56,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:44:58,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:45:00,249 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=672460.0, ans=0.125 2023-09-30 09:45:01,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:45:01,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=672460.0, ans=0.2 2023-09-30 09:45:02,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 09:45:02,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:45:04,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:45:15,498 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.20 vs. limit=15.0 2023-09-30 09:45:33,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672660.0, ans=0.1 2023-09-30 09:45:41,991 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=22.5 2023-09-30 09:46:00,071 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=672726.6666666666, ans=0.0 2023-09-30 09:46:02,578 INFO [train.py:1039] (1/4) Epoch 19, batch 5300, loss[loss=0.1797, simple_loss=0.2552, pruned_loss=0.05216, over 23268.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2531, pruned_loss=0.05189, over 4705085.03 frames. ], batch size: 119, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:46:16,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:46:16,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 09:46:16,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 09:46:16,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:16,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:16,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:16,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:16,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:16,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:17,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:17,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:46:17,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:46:17,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 09:46:17,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 09:46:17,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 09:46:18,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:46:18,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 09:46:18,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 09:46:18,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:19,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:19,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:19,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:19,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:46:20,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:20,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:20,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:20,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:20,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:20,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:46:20,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:20,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:46:21,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 09:46:21,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:22,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:22,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 09:46:22,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 09:46:22,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:46:22,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:22,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 09:46:22,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 09:46:22,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:23,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:46:24,091 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:24,248 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 09:46:24,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 09:46:24,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:46:24,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:24,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 09:46:24,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 09:46:24,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 09:46:25,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:34,311 INFO [train.py:1039] (1/4) Epoch 20, batch 0, loss[loss=0.187, simple_loss=0.2548, pruned_loss=0.05961, over 22747.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2548, pruned_loss=0.05961, over 22747.00 frames. ], batch size: 322, lr: 5.21e-03, grad_scale: 32.0 2023-09-30 09:46:34,312 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 09:46:47,945 INFO [train.py:1071] (1/4) Epoch 20, validation: loss=0.2867, simple_loss=0.2695, pruned_loss=0.152, over 1125622.00 frames. 2023-09-30 09:46:47,945 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 09:46:49,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 09:46:49,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:46:52,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:46:55,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:57,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:46:57,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:57,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=672866.6666666666, ans=0.0 2023-09-30 09:46:58,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 09:47:01,566 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.839e+02 2.043e+02 2.275e+02 5.407e+02, threshold=4.087e+02, percent-clipped=3.0 2023-09-30 09:47:01,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 09:47:05,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:05,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:05,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=672933.3333333334, ans=0.125 2023-09-30 09:47:08,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:10,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:10,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:47:10,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:13,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 09:47:15,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:24,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:47:24,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:25,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 09:47:30,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:47:30,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:47:31,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:35,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:47:39,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:46,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 09:47:50,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 09:47:50,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:47:50,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:47:51,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:47:53,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:56,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 09:47:59,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:47:59,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=673133.3333333334, ans=0.125 2023-09-30 09:48:01,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:04,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:08,966 INFO [train.py:1039] (1/4) Epoch 20, batch 50, loss[loss=0.1909, simple_loss=0.2582, pruned_loss=0.0618, over 23021.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2541, pruned_loss=0.05275, over 1064369.92 frames. ], batch size: 322, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:48:09,063 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 09:48:10,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:48:13,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:16,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:48:16,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 09:48:17,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:48:17,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:48:19,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,379 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=673200.0, ans=0.2 2023-09-30 09:48:24,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:29,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 09:48:29,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:29,541 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=673266.6666666666, ans=0.0 2023-09-30 09:48:36,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:48:38,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 09:48:39,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 09:48:41,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:48:43,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:48:43,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:44,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:48:45,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:48:45,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:48:45,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:53,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:48:56,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:56,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:48:57,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 09:49:00,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:49:00,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:49:00,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 09:49:00,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:03,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 09:49:10,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:10,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:49:12,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:12,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:12,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:14,151 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=673466.6666666666, ans=0.1 2023-09-30 09:49:15,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 09:49:15,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 09:49:17,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:17,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:18,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:49:19,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:20,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 09:49:21,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 09:49:21,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:49:23,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:23,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:49:25,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 09:49:25,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 09:49:27,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:28,520 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:29,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:49:30,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:49:31,412 INFO [train.py:1039] (1/4) Epoch 20, batch 100, loss[loss=0.1741, simple_loss=0.2467, pruned_loss=0.05071, over 23320.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2568, pruned_loss=0.05323, over 1878102.63 frames. ], batch size: 105, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:49:33,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:49:35,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:49:41,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:42,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 09:49:42,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:46,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:49:46,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:48,171 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.857e+02 2.032e+02 2.240e+02 3.945e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 09:49:48,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:48,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:48,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:49,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 09:49:51,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:49:51,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:53,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:53,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:54,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=673600.0, ans=0.2 2023-09-30 09:49:57,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 09:49:59,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:00,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:02,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:50:03,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=673666.6666666666, ans=0.0 2023-09-30 09:50:05,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:50:07,584 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=673666.6666666666, ans=0.125 2023-09-30 09:50:08,840 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 09:50:08,878 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 09:50:10,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:10,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:50:12,841 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=673666.6666666666, ans=0.2 2023-09-30 09:50:14,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:50:15,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:19,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:22,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=673733.3333333334, ans=0.035 2023-09-30 09:50:24,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:25,768 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 09:50:27,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:50:30,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:50:30,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=673733.3333333334, ans=0.125 2023-09-30 09:50:31,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:50:34,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:35,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=673800.0, ans=0.0 2023-09-30 09:50:37,976 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:41,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:50:43,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:50:46,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:48,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:49,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:49,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:50:49,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:51,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 09:50:51,201 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 09:50:51,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:52,559 INFO [train.py:1039] (1/4) Epoch 20, batch 150, loss[loss=0.1621, simple_loss=0.2364, pruned_loss=0.0439, over 24299.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2559, pruned_loss=0.05254, over 2515063.58 frames. ], batch size: 56, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:50:53,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:50:54,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:54,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:54,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:50:54,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:50:54,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:50:56,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:56,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:58,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:59,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:50:59,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:51:02,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:04,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:51:04,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:06,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:09,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:09,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:12,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:51:13,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:17,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 09:51:17,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 09:51:17,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 09:51:22,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:51:22,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:51:22,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:51:24,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:51:24,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:25,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:25,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:29,382 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 09:51:30,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:37,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:42,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:51:42,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 09:51:46,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:51:46,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:46,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:51:47,160 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=674066.6666666666, ans=0.1 2023-09-30 09:51:50,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:51:52,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:52,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:51:53,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:55,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 09:51:59,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:01,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:02,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:52:02,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:52:05,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:07,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 09:52:09,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:52:10,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:52:10,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:13,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:52:13,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 09:52:13,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:52:13,979 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:52:15,050 INFO [train.py:1039] (1/4) Epoch 20, batch 200, loss[loss=0.2055, simple_loss=0.2675, pruned_loss=0.0718, over 23803.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2563, pruned_loss=0.05239, over 3015259.81 frames. ], batch size: 195, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:52:15,146 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 09:52:18,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:21,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:52:21,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:52:24,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 09:52:26,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:26,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:28,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 09:52:30,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:52:31,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:33,172 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.908e+02 2.091e+02 2.356e+02 3.035e+02, threshold=4.181e+02, percent-clipped=0.0 2023-09-30 09:52:34,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:37,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:52:37,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:37,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:57,720 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=674333.3333333334, ans=0.125 2023-09-30 09:52:58,259 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=22.5 2023-09-30 09:52:58,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:52:58,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:53:00,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:53:01,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:02,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:53:02,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:53:05,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:05,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:53:07,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:07,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:10,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 09:53:10,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:53:10,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:10,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=674400.0, ans=0.2 2023-09-30 09:53:14,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:53:14,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=674400.0, ans=0.0 2023-09-30 09:53:21,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:27,519 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.79 vs. limit=22.5 2023-09-30 09:53:27,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:29,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:53:33,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=674466.6666666666, ans=0.125 2023-09-30 09:53:34,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:36,196 INFO [train.py:1039] (1/4) Epoch 20, batch 250, loss[loss=0.1744, simple_loss=0.2567, pruned_loss=0.04609, over 23282.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2553, pruned_loss=0.05212, over 3381281.60 frames. ], batch size: 93, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:53:37,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 09:53:37,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:37,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:53:37,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:39,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:53:41,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 09:53:42,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:53:42,877 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 09:53:46,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:47,193 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.20 vs. limit=10.0 2023-09-30 09:53:48,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:53:49,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:49,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:51,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:53:51,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:54,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:57,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:54:09,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:10,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:54:10,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:54:16,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:54:17,493 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.00 vs. limit=22.5 2023-09-30 09:54:18,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:54:19,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:54:19,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:21,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:54:21,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:54:21,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:21,609 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=674666.6666666666, ans=0.125 2023-09-30 09:54:24,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:54:28,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 09:54:28,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:29,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:54:29,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:54:29,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:54:31,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:54:32,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:54:32,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:54:34,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:37,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:54:37,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:41,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:54:41,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=674800.0, ans=0.125 2023-09-30 09:54:45,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:51,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:54:54,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:56,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:54:59,546 INFO [train.py:1039] (1/4) Epoch 20, batch 300, loss[loss=0.1651, simple_loss=0.2324, pruned_loss=0.04887, over 23556.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2532, pruned_loss=0.05145, over 3678772.09 frames. ], batch size: 149, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:54:59,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 09:55:01,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:01,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:55:01,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 09:55:03,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:55:03,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:55:04,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 09:55:09,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:55:10,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:13,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:55:14,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 09:55:16,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:55:16,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:55:17,751 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.911e+02 2.106e+02 2.458e+02 4.276e+02, threshold=4.211e+02, percent-clipped=1.0 2023-09-30 09:55:17,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 09:55:17,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:21,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:55:24,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:55:25,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 09:55:29,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 09:55:29,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:32,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 09:55:36,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:55:39,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:55:40,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:55:40,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:46,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:55:46,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 09:55:49,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:55:52,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:53,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 09:55:55,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:59,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:01,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:56:01,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 09:56:07,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:07,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:56:08,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:11,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:56:12,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 09:56:12,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:56:12,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:14,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 09:56:14,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=675133.3333333334, ans=0.0 2023-09-30 09:56:17,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:17,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:19,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:19,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:20,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:22,135 INFO [train.py:1039] (1/4) Epoch 20, batch 350, loss[loss=0.1623, simple_loss=0.2471, pruned_loss=0.03876, over 24499.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2516, pruned_loss=0.05067, over 3895357.18 frames. ], batch size: 66, lr: 5.20e-03, grad_scale: 4.0 2023-09-30 09:56:24,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:24,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:56:27,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:34,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:37,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:38,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:41,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.46 vs. limit=15.0 2023-09-30 09:56:42,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 09:56:43,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:44,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 09:56:47,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:48,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 09:56:48,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:51,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 09:56:53,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:56:55,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:56,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:58,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:56:58,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:58,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:57:01,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:01,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:09,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:09,790 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:57:09,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:57:11,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:15,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 09:57:15,930 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:22,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:22,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:22,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:57:23,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 09:57:25,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:26,436 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.46 vs. limit=15.0 2023-09-30 09:57:27,041 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 09:57:27,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.19 vs. limit=15.0 2023-09-30 09:57:28,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 09:57:28,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:30,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=675466.6666666666, ans=0.2 2023-09-30 09:57:31,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:57:31,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 09:57:35,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:36,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:57:38,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:40,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:40,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:42,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:43,396 INFO [train.py:1039] (1/4) Epoch 20, batch 400, loss[loss=0.1792, simple_loss=0.226, pruned_loss=0.06617, over 19387.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2509, pruned_loss=0.05029, over 4068238.06 frames. ], batch size: 389, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:57:43,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:45,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:57:47,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 09:57:48,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:48,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:50,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:57:51,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:53,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:55,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:56,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 09:57:57,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 09:57:57,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:58,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=675600.0, ans=0.125 2023-09-30 09:58:00,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 09:58:00,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:03,708 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.898e+02 2.066e+02 2.335e+02 3.981e+02, threshold=4.133e+02, percent-clipped=0.0 2023-09-30 09:58:04,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:58:04,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:04,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 09:58:05,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:58:05,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:05,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:06,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:58:09,352 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 09:58:10,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 09:58:15,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:58:18,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:18,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=675666.6666666666, ans=0.0 2023-09-30 09:58:19,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 09:58:20,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 09:58:23,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:58:24,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:25,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=675666.6666666666, ans=0.0 2023-09-30 09:58:27,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=675666.6666666666, ans=0.0 2023-09-30 09:58:33,026 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 09:58:34,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:58:36,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 09:58:36,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=675733.3333333334, ans=0.0 2023-09-30 09:58:38,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:41,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:58:41,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 09:58:45,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:58:47,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:58:48,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:51,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:51,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 09:58:53,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:58:55,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 09:58:56,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=675800.0, ans=0.125 2023-09-30 09:58:58,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:58:58,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:59:00,755 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.30 vs. limit=22.5 2023-09-30 09:59:01,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 09:59:02,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:59:03,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:59:04,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:59:04,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 09:59:04,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:59:06,081 INFO [train.py:1039] (1/4) Epoch 20, batch 450, loss[loss=0.1617, simple_loss=0.2441, pruned_loss=0.03963, over 24299.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2524, pruned_loss=0.05037, over 4209353.08 frames. ], batch size: 61, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:59:06,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:59:07,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:59:07,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 09:59:09,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:59:10,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:59:13,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:59:25,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:25,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:59:26,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 09:59:26,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=675933.3333333334, ans=0.2 2023-09-30 09:59:28,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 09:59:32,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:59:33,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:34,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:40,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:41,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:44,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 09:59:44,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 09:59:46,827 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=676000.0, ans=0.125 2023-09-30 09:59:46,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=676000.0, ans=0.125 2023-09-30 09:59:47,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 09:59:49,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:59:49,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:51,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:59:53,472 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 09:59:53,487 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 09:59:53,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:55,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=676066.6666666666, ans=0.1 2023-09-30 09:59:57,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:59:58,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:00:01,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:00:01,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:00:03,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:00:05,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 10:00:06,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:09,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:00:09,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:00:12,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 10:00:15,068 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=15.0 2023-09-30 10:00:16,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:00:17,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 10:00:19,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 10:00:19,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:23,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=676133.3333333334, ans=0.025 2023-09-30 10:00:26,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:00:27,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:29,627 INFO [train.py:1039] (1/4) Epoch 20, batch 500, loss[loss=0.1735, simple_loss=0.2561, pruned_loss=0.04546, over 24632.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2531, pruned_loss=0.05077, over 4323582.56 frames. ], batch size: 65, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:00:29,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:00:29,770 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 10:00:33,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:35,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:00:35,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:35,161 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 10:00:37,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 10:00:38,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:41,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:00:47,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:00:48,765 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.813e+02 1.975e+02 2.252e+02 5.149e+02, threshold=3.950e+02, percent-clipped=1.0 2023-09-30 10:00:48,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:00:51,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:51,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:52,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:00:56,806 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.05 vs. limit=15.0 2023-09-30 10:01:00,847 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.59 vs. limit=15.0 2023-09-30 10:01:05,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:01:05,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:01:05,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 10:01:05,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:01:10,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:01:10,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:01:10,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:01:10,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:12,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 10:01:15,459 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 10:01:17,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:18,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:20,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:01:23,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 10:01:26,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:01:28,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:33,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:33,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=676466.6666666666, ans=0.125 2023-09-30 10:01:34,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:40,132 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=676466.6666666666, ans=0.125 2023-09-30 10:01:42,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:47,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 10:01:47,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:47,429 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:49,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 10:01:50,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:01:52,013 INFO [train.py:1039] (1/4) Epoch 20, batch 550, loss[loss=0.1755, simple_loss=0.2546, pruned_loss=0.04824, over 23734.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2546, pruned_loss=0.05165, over 4408609.32 frames. ], batch size: 85, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:01:52,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:58,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 10:01:59,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 10:01:59,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:59,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 10:02:01,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:02:01,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:02,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:02:06,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:02:07,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:02:09,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 10:02:09,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:02:13,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:14,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:17,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:17,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:23,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 10:02:23,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 10:02:24,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:02:24,967 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:02:25,324 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.73 vs. limit=22.5 2023-09-30 10:02:28,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=676666.6666666666, ans=0.2 2023-09-30 10:02:30,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:02:30,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:32,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:02:34,910 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.12 vs. limit=15.0 2023-09-30 10:02:36,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:36,866 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 10:02:37,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:39,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:02:42,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:42,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:02:42,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:02:44,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:46,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 10:02:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 10:02:49,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:02:49,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:49,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:02:49,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:52,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:02:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:02:57,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:02:58,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:59,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 10:02:59,702 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.22 vs. limit=15.0 2023-09-30 10:03:00,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:03:00,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:02,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:03:02,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:03,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:03:03,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:03:10,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 10:03:12,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=676866.6666666666, ans=0.0 2023-09-30 10:03:13,672 INFO [train.py:1039] (1/4) Epoch 20, batch 600, loss[loss=0.1825, simple_loss=0.26, pruned_loss=0.05245, over 24028.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2551, pruned_loss=0.05214, over 4458015.69 frames. ], batch size: 80, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:03:13,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 10:03:13,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:03:15,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:03:15,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:25,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:03:25,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:03:28,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 10:03:31,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:03:33,013 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.840e+02 2.105e+02 2.465e+02 3.570e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 10:03:33,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:03:36,114 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:37,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 10:03:37,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:03:42,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 10:03:46,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:03:46,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:47,150 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=677000.0, ans=0.125 2023-09-30 10:03:48,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:03:49,963 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.72 vs. limit=12.0 2023-09-30 10:03:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:03:54,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:03:54,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:02,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:04:06,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:06,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:04:06,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:04:12,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 10:04:18,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:04:18,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:04:23,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 10:04:25,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:04:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 10:04:27,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:04:28,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:04:33,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=677200.0, ans=0.125 2023-09-30 10:04:34,464 INFO [train.py:1039] (1/4) Epoch 20, batch 650, loss[loss=0.1859, simple_loss=0.2509, pruned_loss=0.06044, over 23698.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2539, pruned_loss=0.05163, over 4520619.66 frames. ], batch size: 179, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:04:36,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:04:37,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:04:39,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:04:42,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:04:43,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:04:46,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 10:04:48,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:53,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:04:53,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:04:57,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:01,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 10:05:04,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:04,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:09,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:09,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:05:11,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:11,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:12,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:05:14,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:15,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:05:16,252 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.11 vs. limit=12.0 2023-09-30 10:05:17,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:05:17,346 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 10:05:17,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:17,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:19,535 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.51 vs. limit=12.0 2023-09-30 10:05:20,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:20,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:22,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:22,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:05:23,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 10:05:23,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:05:23,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:05:27,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:05:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:29,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:05:31,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 10:05:33,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 10:05:33,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:33,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:33,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:05:34,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:35,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:37,174 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=677400.0, ans=0.0 2023-09-30 10:05:41,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:41,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:43,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:43,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=677466.6666666666, ans=0.125 2023-09-30 10:05:46,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:46,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:05:46,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:51,046 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=677466.6666666666, ans=0.1 2023-09-30 10:05:53,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:05:53,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:53,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:05:55,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:56,757 INFO [train.py:1039] (1/4) Epoch 20, batch 700, loss[loss=0.1772, simple_loss=0.2588, pruned_loss=0.04776, over 24372.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2521, pruned_loss=0.05095, over 4561977.84 frames. ], batch size: 77, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:06:00,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 10:06:01,526 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=15.0 2023-09-30 10:06:02,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 10:06:04,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 10:06:05,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:05,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=677533.3333333334, ans=0.0 2023-09-30 10:06:06,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:06:08,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 10:06:14,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:15,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:06:17,050 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.862e+02 2.095e+02 2.460e+02 3.900e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 10:06:18,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:18,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:06:20,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:06:23,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:25,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:06:25,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:06:26,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 10:06:28,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=677666.6666666666, ans=0.0 2023-09-30 10:06:29,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 10:06:35,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:06:35,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:06:37,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:06:41,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:06:41,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 10:06:45,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:47,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:06:47,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 10:06:47,750 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=677733.3333333334, ans=0.1 2023-09-30 10:06:49,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=677733.3333333334, ans=0.1 2023-09-30 10:06:50,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:52,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:55,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:00,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:07:00,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 10:07:06,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 10:07:06,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 10:07:10,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:10,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:11,973 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:13,611 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:13,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 10:07:15,504 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=677800.0, ans=0.0 2023-09-30 10:07:18,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 10:07:18,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 10:07:18,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 10:07:19,751 INFO [train.py:1039] (1/4) Epoch 20, batch 750, loss[loss=0.1597, simple_loss=0.2439, pruned_loss=0.03779, over 24451.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2516, pruned_loss=0.05044, over 4596407.31 frames. ], batch size: 63, lr: 5.19e-03, grad_scale: 8.0 2023-09-30 10:07:21,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 10:07:21,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 10:07:21,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:07:23,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 10:07:24,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:24,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:25,130 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.34 vs. limit=22.5 2023-09-30 10:07:27,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:30,614 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:30,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:07:32,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:32,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=677866.6666666666, ans=0.2 2023-09-30 10:07:33,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:07:35,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:07:36,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:07:40,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:40,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:40,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 10:07:43,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:07:43,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:45,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:46,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:07:47,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 10:07:47,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:49,086 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=15.0 2023-09-30 10:07:49,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 10:07:49,956 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 10:07:50,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 10:07:50,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:07:50,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:07:51,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=678000.0, ans=0.125 2023-09-30 10:07:53,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:07:59,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:59,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:07:59,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:08:01,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=678000.0, ans=0.2 2023-09-30 10:08:02,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:08:04,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:04,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 10:08:05,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:08:05,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 10:08:07,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:08:09,090 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=678066.6666666666, ans=0.2 2023-09-30 10:08:09,505 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.15 vs. limit=15.0 2023-09-30 10:08:11,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:08:12,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 10:08:12,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:18,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=678066.6666666666, ans=0.09899494936611666 2023-09-30 10:08:19,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:22,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:08:22,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:24,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:08:28,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 10:08:28,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:28,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:32,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:33,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:35,378 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=678133.3333333334, ans=0.1 2023-09-30 10:08:36,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:37,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.07 vs. limit=22.5 2023-09-30 10:08:37,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:08:41,003 INFO [train.py:1039] (1/4) Epoch 20, batch 800, loss[loss=0.1683, simple_loss=0.2552, pruned_loss=0.0407, over 24652.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.252, pruned_loss=0.05032, over 4629570.49 frames. ], batch size: 73, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:08:44,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:44,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:46,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=678200.0, ans=0.125 2023-09-30 10:08:47,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:47,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:48,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:48,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:50,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:54,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:55,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:08:58,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 10:09:00,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:01,437 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.889e+02 2.125e+02 2.539e+02 3.349e+02, threshold=4.249e+02, percent-clipped=0.0 2023-09-30 10:09:01,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:09:01,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:03,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:03,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 10:09:03,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:03,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 10:09:07,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:10,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:11,083 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=678266.6666666666, ans=0.125 2023-09-30 10:09:11,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=678266.6666666666, ans=0.0 2023-09-30 10:09:12,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:09:12,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:16,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:16,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:20,554 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=678333.3333333334, ans=0.0 2023-09-30 10:09:23,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:09:23,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:09:24,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 10:09:25,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=678333.3333333334, ans=0.0 2023-09-30 10:09:25,851 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.02 vs. limit=15.0 2023-09-30 10:09:27,836 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 10:09:27,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 10:09:27,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:09:27,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:09:29,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:31,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:09:36,372 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 10:09:37,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 10:09:38,691 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-09-30 10:09:39,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:09:39,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:09:42,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:09:45,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:46,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 10:09:47,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:51,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 10:09:58,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:00,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:10:03,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 10:10:04,514 INFO [train.py:1039] (1/4) Epoch 20, batch 850, loss[loss=0.1897, simple_loss=0.254, pruned_loss=0.06268, over 23400.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2531, pruned_loss=0.05048, over 4661665.98 frames. ], batch size: 285, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:10:04,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:10:04,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:07,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 10:10:07,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:07,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:10:08,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:09,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.74 vs. limit=6.0 2023-09-30 10:10:10,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:10:11,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:10:12,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=678533.3333333334, ans=0.125 2023-09-30 10:10:13,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 10:10:13,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 10:10:13,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 10:10:14,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:14,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:10:16,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=678533.3333333334, ans=0.125 2023-09-30 10:10:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:17,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:17,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:10:23,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:23,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:24,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=678600.0, ans=0.1 2023-09-30 10:10:25,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 10:10:25,775 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=678600.0, ans=0.0 2023-09-30 10:10:27,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 10:10:30,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:32,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 10:10:38,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 10:10:39,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 10:10:43,208 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 10:10:43,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:43,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:10:43,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:10:46,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,257 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.67 vs. limit=6.0 2023-09-30 10:10:47,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 10:10:48,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=678666.6666666666, ans=0.2 2023-09-30 10:10:49,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:51,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:51,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:10:51,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:10:52,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:10:55,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:10:55,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 10:10:56,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=678733.3333333334, ans=0.0 2023-09-30 10:10:58,327 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=678733.3333333334, ans=0.02 2023-09-30 10:11:00,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:11:00,926 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:02,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:11:02,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:03,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:08,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:11:09,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:11:13,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:11:13,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:14,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:11:23,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:11:24,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:24,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 10:11:26,270 INFO [train.py:1039] (1/4) Epoch 20, batch 900, loss[loss=0.1797, simple_loss=0.2467, pruned_loss=0.05638, over 23332.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2543, pruned_loss=0.05106, over 4676463.60 frames. ], batch size: 105, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:11:26,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:26,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:28,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 10:11:36,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:11:37,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:39,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 10:11:40,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:11:41,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 10:11:43,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:11:45,391 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.863e+02 2.352e+02 2.808e+02 3.950e+02, threshold=4.705e+02, percent-clipped=0.0 2023-09-30 10:11:45,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:45,496 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:11:45,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:11:45,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:11:55,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:55,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:55,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:11:59,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:05,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 10:12:05,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:12:13,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:12:15,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:12:15,163 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 10:12:16,715 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 10:12:22,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:12:22,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:12:22,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:12:24,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=679066.6666666666, ans=0.09899494936611666 2023-09-30 10:12:28,927 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:29,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=679066.6666666666, ans=0.125 2023-09-30 10:12:30,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:12:31,167 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.14 vs. limit=15.0 2023-09-30 10:12:31,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 10:12:31,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:34,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 10:12:36,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:12:37,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:39,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:12:39,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:12:43,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 10:12:43,065 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 10:12:43,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:12:43,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 10:12:47,490 INFO [train.py:1039] (1/4) Epoch 20, batch 950, loss[loss=0.1788, simple_loss=0.2672, pruned_loss=0.04522, over 24287.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2548, pruned_loss=0.0514, over 4687577.55 frames. ], batch size: 74, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:12:47,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:50,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 10:12:56,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:00,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:13:03,176 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 10:13:06,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:07,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:07,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:08,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:13:09,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 10:13:11,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:13:13,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:13,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 10:13:14,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:18,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:13:19,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 10:13:22,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:13:24,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:26,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:13:32,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:13:32,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:36,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 10:13:37,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:13:37,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:13:39,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:40,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:40,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:13:44,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 10:13:46,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:13:47,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:47,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:47,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 10:13:49,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:49,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:13:49,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 10:13:52,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=679466.6666666666, ans=0.035 2023-09-30 10:13:54,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:13:57,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:59,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=679466.6666666666, ans=0.0 2023-09-30 10:14:00,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:02,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 10:14:02,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 10:14:07,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:14:10,686 INFO [train.py:1039] (1/4) Epoch 20, batch 1000, loss[loss=0.1903, simple_loss=0.2592, pruned_loss=0.06071, over 23675.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2536, pruned_loss=0.0511, over 4691899.04 frames. ], batch size: 149, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:14:13,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 10:14:15,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:20,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:14:20,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 10:14:20,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 10:14:22,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=679533.3333333334, ans=0.0 2023-09-30 10:14:24,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=679533.3333333334, ans=0.0 2023-09-30 10:14:25,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:25,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:25,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=679600.0, ans=0.2 2023-09-30 10:14:28,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:29,656 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.877e+02 1.981e+02 2.279e+02 3.470e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 10:14:29,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 10:14:30,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=679600.0, ans=0.1 2023-09-30 10:14:36,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 10:14:37,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 10:14:37,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:38,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 10:14:39,736 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 10:14:41,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 10:14:43,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:45,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:55,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:55,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:14:56,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:56,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:56,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 10:14:56,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:57,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:14:57,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=679666.6666666666, ans=0.2 2023-09-30 10:14:58,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:58,586 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 10:15:01,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 10:15:02,638 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.65 vs. limit=22.5 2023-09-30 10:15:03,794 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-09-30 10:15:04,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 10:15:04,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 10:15:06,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:15:15,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:15,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:15:15,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:17,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:15:19,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 10:15:19,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=679800.0, ans=0.1 2023-09-30 10:15:21,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:15:21,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 10:15:22,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 10:15:24,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:24,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:15:26,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:15:30,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:15:31,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:15:34,049 INFO [train.py:1039] (1/4) Epoch 20, batch 1050, loss[loss=0.1891, simple_loss=0.2604, pruned_loss=0.05893, over 23772.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2513, pruned_loss=0.05084, over 4697278.90 frames. ], batch size: 180, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:15:35,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:15:37,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:15:40,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:15:41,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:44,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:15:46,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:15:48,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:15:50,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:15:52,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:15:52,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:15:53,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:15:55,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 10:15:55,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:15:55,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 10:15:58,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:59,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 10:15:59,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:16:04,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:16:04,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:16:04,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:16:06,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=680000.0, ans=0.2 2023-09-30 10:16:07,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 10:16:07,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 10:16:09,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:16:11,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 10:16:13,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 10:16:14,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:19,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:16:22,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:16:23,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:16:24,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:16:29,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:16:32,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 10:16:34,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 10:16:34,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 10:16:34,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:34,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:16:34,914 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=680066.6666666666, ans=0.125 2023-09-30 10:16:38,153 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 10:16:41,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:16:42,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:42,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:16:44,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:44,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 10:16:49,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:49,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 10:16:49,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 10:16:51,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:16:55,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:16:56,283 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=680200.0, ans=0.1 2023-09-30 10:16:58,141 INFO [train.py:1039] (1/4) Epoch 20, batch 1100, loss[loss=0.1752, simple_loss=0.2533, pruned_loss=0.04851, over 23101.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2512, pruned_loss=0.05064, over 4698209.97 frames. ], batch size: 105, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:17:01,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:17:01,621 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:17:06,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:17:08,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:17:10,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:10,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 10:17:12,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:17:14,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.03 vs. limit=6.0 2023-09-30 10:17:15,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:17:16,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:17:17,920 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.800e+02 1.958e+02 2.202e+02 3.142e+02, threshold=3.917e+02, percent-clipped=0.0 2023-09-30 10:17:19,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:17:20,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 10:17:21,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:17:23,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:23,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:17:24,967 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=680266.6666666666, ans=0.1 2023-09-30 10:17:26,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:17:27,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:17:30,358 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.12 vs. limit=15.0 2023-09-30 10:17:33,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:17:36,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 10:17:36,472 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 10:17:37,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:40,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:41,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:17:41,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:17:43,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 10:17:43,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:17:43,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:17:45,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:17:45,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:45,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 10:17:50,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:17:50,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 10:17:53,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:17:59,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:18:02,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 10:18:04,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 10:18:05,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:08,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:09,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:11,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 10:18:12,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:18:12,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:14,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 10:18:14,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:18:16,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 10:18:16,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:18:16,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:18:17,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:18:18,890 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.06 vs. limit=15.0 2023-09-30 10:18:21,503 INFO [train.py:1039] (1/4) Epoch 20, batch 1150, loss[loss=0.1556, simple_loss=0.239, pruned_loss=0.03607, over 24668.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.253, pruned_loss=0.05157, over 4701151.59 frames. ], batch size: 65, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:18:24,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:29,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:18:31,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:31,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:18:31,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 10:18:31,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:34,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 10:18:36,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:36,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:18:40,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 10:18:43,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:47,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:48,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:18:48,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 10:18:48,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:18:50,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:53,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 10:18:55,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:57,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:19:03,863 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.80 vs. limit=22.5 2023-09-30 10:19:06,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:17,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 10:19:17,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:17,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:26,250 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 10:19:27,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:32,860 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 10:19:38,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:38,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:19:39,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:19:39,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:19:43,295 INFO [train.py:1039] (1/4) Epoch 20, batch 1200, loss[loss=0.185, simple_loss=0.2519, pruned_loss=0.05905, over 23773.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2528, pruned_loss=0.05108, over 4713960.77 frames. ], batch size: 179, lr: 5.18e-03, grad_scale: 32.0 2023-09-30 10:19:44,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=12.0 2023-09-30 10:19:44,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:46,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=680866.6666666666, ans=0.125 2023-09-30 10:19:49,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:19:49,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:19:51,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:19:51,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:52,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:19:53,194 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=680866.6666666666, ans=0.125 2023-09-30 10:19:55,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:19:57,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:19:59,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:59,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:02,646 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.067e+02 2.397e+02 3.713e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 10:20:02,912 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 10:20:03,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=680933.3333333334, ans=0.125 2023-09-30 10:20:04,567 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 10:20:08,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:20:11,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:20:14,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:15,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:20:15,845 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 10:20:17,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:23,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=681000.0, ans=0.2 2023-09-30 10:20:25,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:20:25,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:20:26,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 10:20:27,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:20:30,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 10:20:30,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=681066.6666666666, ans=0.0 2023-09-30 10:20:34,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 10:20:34,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:36,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:38,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:38,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:20:39,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=681066.6666666666, ans=0.0 2023-09-30 10:20:41,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:41,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:20:43,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:20:43,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 10:20:44,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:20:44,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:20:44,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:20:48,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:20:48,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:52,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:20:55,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:20:57,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 10:21:01,179 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 10:21:03,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:05,359 INFO [train.py:1039] (1/4) Epoch 20, batch 1250, loss[loss=0.1882, simple_loss=0.2596, pruned_loss=0.05835, over 23714.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2534, pruned_loss=0.05091, over 4736335.59 frames. ], batch size: 149, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:21:06,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:21:09,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:21:10,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:21:14,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 10:21:17,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:21:19,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:20,398 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.60 vs. limit=5.0 2023-09-30 10:21:20,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 10:21:23,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:21:25,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:21:29,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:21:30,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:31,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:21:31,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:33,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:21:38,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:21:38,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:21:38,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:39,777 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.50 vs. limit=15.0 2023-09-30 10:21:40,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:41,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:42,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=681333.3333333334, ans=0.125 2023-09-30 10:21:43,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:21:45,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:21:51,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 10:21:51,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:21:53,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:21:54,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 10:21:54,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:54,711 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 10:21:56,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:56,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:58,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:22:02,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:22:02,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:22:04,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 10:22:05,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 10:22:05,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 10:22:08,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:09,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 10:22:11,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:15,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:22:15,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:22:16,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 10:22:16,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:22:18,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:22:18,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:22:18,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:22:19,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 10:22:23,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:24,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=681466.6666666666, ans=0.2 2023-09-30 10:22:25,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:22:27,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:22:27,901 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.40 vs. limit=15.0 2023-09-30 10:22:28,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:22:30,072 INFO [train.py:1039] (1/4) Epoch 20, batch 1300, loss[loss=0.163, simple_loss=0.2488, pruned_loss=0.03857, over 24653.00 frames. ], tot_loss[loss=0.178, simple_loss=0.254, pruned_loss=0.05101, over 4742916.18 frames. ], batch size: 68, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:22:31,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:33,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 10:22:35,325 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=681533.3333333334, ans=0.1 2023-09-30 10:22:37,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:41,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:22:41,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:22:44,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:44,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:22:46,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 10:22:52,545 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.925e+02 2.165e+02 2.491e+02 3.486e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 10:22:52,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:22:54,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:22:56,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 10:22:57,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:23:03,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:03,298 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=681666.6666666666, ans=0.125 2023-09-30 10:23:04,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:06,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:23:06,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:07,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:23:07,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:23:07,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 10:23:14,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:23:15,445 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=15.0 2023-09-30 10:23:16,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:23:17,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 10:23:17,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:23:20,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:23:21,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=681733.3333333334, ans=0.1 2023-09-30 10:23:21,780 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=22.5 2023-09-30 10:23:23,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:23:23,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 10:23:25,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:25,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 10:23:27,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:31,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:31,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:23:34,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 10:23:34,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 10:23:36,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 10:23:40,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:23:44,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 10:23:45,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:50,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=681866.6666666666, ans=0.025 2023-09-30 10:23:52,102 INFO [train.py:1039] (1/4) Epoch 20, batch 1350, loss[loss=0.1821, simple_loss=0.2757, pruned_loss=0.04423, over 24367.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2533, pruned_loss=0.05121, over 4724943.08 frames. ], batch size: 77, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:23:52,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 10:23:54,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=681866.6666666666, ans=0.0 2023-09-30 10:23:56,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:00,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:00,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=681866.6666666666, ans=0.0 2023-09-30 10:24:04,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:24:04,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:06,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:24:07,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:07,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=681933.3333333334, ans=0.0 2023-09-30 10:24:10,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 10:24:15,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:15,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:24:18,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 10:24:18,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:24:19,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:24:20,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 10:24:21,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 10:24:24,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 10:24:26,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=682000.0, ans=0.0 2023-09-30 10:24:27,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:27,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 10:24:40,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:50,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 10:24:53,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:54,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 10:24:54,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:54,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:58,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:25:00,487 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=682133.3333333334, ans=0.0 2023-09-30 10:25:01,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 10:25:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:25:08,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 10:25:09,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 10:25:14,393 INFO [train.py:1039] (1/4) Epoch 20, batch 1400, loss[loss=0.181, simple_loss=0.255, pruned_loss=0.05352, over 23593.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2527, pruned_loss=0.05066, over 4738328.23 frames. ], batch size: 149, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:25:16,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 10:25:18,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:25:21,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:25:23,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:25:29,108 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 10:25:32,555 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 10:25:37,259 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.836e+02 1.987e+02 2.257e+02 3.606e+02, threshold=3.975e+02, percent-clipped=0.0 2023-09-30 10:25:42,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:25:44,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:25:46,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=682333.3333333334, ans=0.125 2023-09-30 10:25:47,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:25:47,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:25:52,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:25:52,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:25:55,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=682333.3333333334, ans=0.1 2023-09-30 10:26:03,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:03,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:09,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 10:26:09,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:26:09,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:26:09,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=682400.0, ans=0.0 2023-09-30 10:26:10,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:26:10,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:12,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:26:12,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:26:13,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:26:14,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 10:26:14,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:26:19,296 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=682466.6666666666, ans=0.0 2023-09-30 10:26:21,467 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682466.6666666666, ans=0.1 2023-09-30 10:26:22,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:25,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:26:26,012 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=682466.6666666666, ans=0.125 2023-09-30 10:26:31,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 10:26:32,171 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.58 vs. limit=22.5 2023-09-30 10:26:32,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:26:34,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:26:35,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:26:36,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:37,957 INFO [train.py:1039] (1/4) Epoch 20, batch 1450, loss[loss=0.1677, simple_loss=0.2521, pruned_loss=0.04166, over 24469.00 frames. ], tot_loss[loss=0.176, simple_loss=0.251, pruned_loss=0.05054, over 4723808.72 frames. ], batch size: 66, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:26:38,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:26:41,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:26:44,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:26:44,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:44,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:26:50,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:51,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:26:52,046 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.53 vs. limit=15.0 2023-09-30 10:26:53,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:53,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 10:26:54,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:26:56,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 10:26:56,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:56,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:26:56,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 10:26:59,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:26:59,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:27:00,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 10:27:00,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:01,214 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682600.0, ans=0.1 2023-09-30 10:27:03,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:27:04,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:07,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:11,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:27:11,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:27:16,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:27:16,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:16,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:17,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:27:17,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:17,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:22,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 10:27:26,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:27:31,136 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 10:27:32,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:32,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:27:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:36,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 10:27:39,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:41,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 10:27:42,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 10:27:43,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:46,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:27:47,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:51,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 10:27:51,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 10:27:52,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 10:27:54,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:54,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:28:01,514 INFO [train.py:1039] (1/4) Epoch 20, batch 1500, loss[loss=0.1825, simple_loss=0.2659, pruned_loss=0.04953, over 24429.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2511, pruned_loss=0.05019, over 4725474.38 frames. ], batch size: 69, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:28:06,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 10:28:07,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:28:07,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:28:09,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:09,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:11,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:28:12,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 10:28:14,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:28:14,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:28:14,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:16,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:28:16,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:28:16,623 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682933.3333333334, ans=0.1 2023-09-30 10:28:17,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:18,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=682933.3333333334, ans=0.0 2023-09-30 10:28:24,678 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.891e+02 2.112e+02 2.423e+02 4.358e+02, threshold=4.223e+02, percent-clipped=4.0 2023-09-30 10:28:24,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:24,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 10:28:24,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:28:25,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:28:25,168 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=682933.3333333334, ans=0.0 2023-09-30 10:28:26,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:29,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 10:28:32,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 10:28:35,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:36,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 10:28:37,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=683000.0, ans=0.1 2023-09-30 10:28:38,492 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=683000.0, ans=0.125 2023-09-30 10:28:39,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:28:41,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:28:42,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:42,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:28:44,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 10:28:45,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:28:45,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:46,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=683000.0, ans=0.125 2023-09-30 10:28:47,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 10:28:47,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:51,137 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=683066.6666666666, ans=0.125 2023-09-30 10:28:54,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:28:54,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 10:28:54,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=683066.6666666666, ans=0.125 2023-09-30 10:28:59,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:29:02,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:29:06,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 10:29:06,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:06,671 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 10:29:09,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:11,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:12,110 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 10:29:12,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:29:16,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 10:29:18,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:18,511 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=683133.3333333334, ans=0.09899494936611666 2023-09-30 10:29:21,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:21,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:22,799 INFO [train.py:1039] (1/4) Epoch 20, batch 1550, loss[loss=0.1711, simple_loss=0.2497, pruned_loss=0.04625, over 20252.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2516, pruned_loss=0.05072, over 4718461.88 frames. ], batch size: 44, lr: 5.17e-03, grad_scale: 8.0 2023-09-30 10:29:22,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:23,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:29:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 10:29:26,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 10:29:26,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:29:28,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 10:29:28,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 10:29:31,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:33,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:33,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:29:33,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:29:35,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:35,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:37,006 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 10:29:38,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:38,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:29:39,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:29:40,315 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=683266.6666666666, ans=0.0 2023-09-30 10:29:42,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:29:43,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 10:29:43,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:43,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 10:29:45,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 10:29:45,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 10:29:47,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:49,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:29:52,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:55,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 10:29:55,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 10:30:02,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=683333.3333333334, ans=0.1 2023-09-30 10:30:05,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:08,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:30:08,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:30:08,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:30:10,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 10:30:15,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:30:18,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:20,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:30:23,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:30:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:24,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 10:30:25,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:26,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:30:28,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:28,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:30:29,835 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 10:30:31,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:35,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=683466.6666666666, ans=0.125 2023-09-30 10:30:38,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 10:30:40,694 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.05 vs. limit=22.5 2023-09-30 10:30:44,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:46,000 INFO [train.py:1039] (1/4) Epoch 20, batch 1600, loss[loss=0.1607, simple_loss=0.2418, pruned_loss=0.03977, over 24541.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2524, pruned_loss=0.05091, over 4723358.74 frames. ], batch size: 63, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:30:46,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:46,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 10:30:47,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:49,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:49,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:30:49,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:30:50,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:30:54,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:54,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 10:30:54,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 10:30:56,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 10:30:56,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=683533.3333333334, ans=0.2 2023-09-30 10:30:58,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:01,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 10:31:01,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=683600.0, ans=0.025 2023-09-30 10:31:03,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:31:04,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:31:09,444 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.873e+02 2.054e+02 2.261e+02 3.957e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 10:31:11,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:31:12,210 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=683600.0, ans=0.125 2023-09-30 10:31:13,373 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=683600.0, ans=0.0 2023-09-30 10:31:14,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 10:31:16,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:31:17,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 10:31:19,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:19,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 10:31:21,583 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.74 vs. limit=22.5 2023-09-30 10:31:25,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 10:31:34,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:35,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 10:31:36,355 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.17 vs. limit=15.0 2023-09-30 10:31:37,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:37,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:37,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:31:40,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 10:31:45,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 10:31:47,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:31:48,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:49,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:31:52,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:31:53,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:31:55,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:31:55,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=683800.0, ans=10.0 2023-09-30 10:32:00,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:01,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:02,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=683800.0, ans=10.0 2023-09-30 10:32:04,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 10:32:04,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:32:05,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 10:32:08,693 INFO [train.py:1039] (1/4) Epoch 20, batch 1650, loss[loss=0.1837, simple_loss=0.243, pruned_loss=0.06215, over 23825.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2537, pruned_loss=0.05129, over 4698924.28 frames. ], batch size: 164, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:32:09,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:11,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:32:13,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:32:13,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 10:32:13,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 10:32:13,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 10:32:13,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 10:32:16,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=683866.6666666666, ans=0.1 2023-09-30 10:32:18,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:20,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:20,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:32:20,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:32:21,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:24,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 10:32:27,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:32:27,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:27,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:32:27,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:32:28,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 10:32:28,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 10:32:29,314 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=2.636e-03 2023-09-30 10:32:35,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:32:38,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:32:47,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 10:32:47,953 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=684000.0, ans=0.2 2023-09-30 10:32:48,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:32:52,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 10:32:55,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:32:58,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:32:58,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:58,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:32:59,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=684066.6666666666, ans=0.0 2023-09-30 10:33:01,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:33:01,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:03,222 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=12.0 2023-09-30 10:33:04,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:04,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:06,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:06,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:07,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:10,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:33:12,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:13,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 10:33:15,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:15,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=684133.3333333334, ans=0.125 2023-09-30 10:33:16,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 10:33:19,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 10:33:19,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 10:33:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:20,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:33:20,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:20,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:20,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 10:33:24,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:25,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:33:25,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:27,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=684133.3333333334, ans=0.125 2023-09-30 10:33:30,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 10:33:32,085 INFO [train.py:1039] (1/4) Epoch 20, batch 1700, loss[loss=0.1667, simple_loss=0.2291, pruned_loss=0.05216, over 23321.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2534, pruned_loss=0.05122, over 4706511.14 frames. ], batch size: 285, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:33:33,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:33,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:33:33,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 10:33:35,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=15.0 2023-09-30 10:33:36,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:33:36,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:33:36,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:38,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:33:38,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:33:39,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 10:33:41,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:33:41,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=684200.0, ans=0.125 2023-09-30 10:33:51,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:54,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:33:55,785 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.837e+02 2.036e+02 2.305e+02 3.271e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-30 10:33:59,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:33:59,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:00,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:34:00,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:05,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 10:34:05,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:34:05,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:09,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:34:11,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:34:12,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 10:34:12,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 10:34:14,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:15,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 10:34:17,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:34:22,655 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.13 vs. limit=10.0 2023-09-30 10:34:27,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:27,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:28,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:30,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:34:30,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 10:34:31,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:33,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:33,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 10:34:34,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:34:34,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:36,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:36,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:34:39,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:39,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:34:41,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:43,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:34:43,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:47,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:47,679 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=684466.6666666666, ans=0.0 2023-09-30 10:34:48,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 10:34:50,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:51,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:52,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=684466.6666666666, ans=0.0 2023-09-30 10:34:53,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 10:34:54,997 INFO [train.py:1039] (1/4) Epoch 20, batch 1750, loss[loss=0.1636, simple_loss=0.2446, pruned_loss=0.04128, over 24500.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2526, pruned_loss=0.05087, over 4703129.83 frames. ], batch size: 63, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:35:00,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:01,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:03,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:35:03,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 10:35:03,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:35:03,878 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=684533.3333333334, ans=0.0 2023-09-30 10:35:06,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:35:06,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:11,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 10:35:13,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:17,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 10:35:17,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:19,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:35:19,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=684600.0, ans=0.125 2023-09-30 10:35:20,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:35:22,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 10:35:24,121 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:35:24,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=684600.0, ans=0.125 2023-09-30 10:35:25,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 10:35:32,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:35:32,633 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=684666.6666666666, ans=0.0 2023-09-30 10:35:35,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:35:35,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:38,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:38,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:40,154 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:35:43,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:46,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:46,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:46,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=684733.3333333334, ans=0.125 2023-09-30 10:35:48,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 10:35:50,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:52,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 10:35:54,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:35:55,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:55,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:35:58,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:36:00,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:36:01,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:01,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:36:02,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=684800.0, ans=0.125 2023-09-30 10:36:05,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:08,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:10,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:36:10,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=684800.0, ans=0.05 2023-09-30 10:36:11,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 10:36:11,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:13,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:36:13,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:13,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:36:14,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:36:16,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:36:18,220 INFO [train.py:1039] (1/4) Epoch 20, batch 1800, loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03961, over 24313.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2509, pruned_loss=0.05052, over 4687914.35 frames. ], batch size: 61, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:36:18,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:36:20,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:22,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:36:24,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:27,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:36:30,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:36:32,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:34,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:35,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:37,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:36:37,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:38,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 10:36:38,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:41,814 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.858e+02 2.042e+02 2.295e+02 4.168e+02, threshold=4.085e+02, percent-clipped=1.0 2023-09-30 10:36:42,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:45,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=684933.3333333334, ans=0.1 2023-09-30 10:36:46,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 10:36:48,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 10:36:48,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 10:36:49,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:51,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:51,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:53,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:37:02,068 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 10:37:02,844 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.07 vs. limit=22.5 2023-09-30 10:37:03,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:37:05,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:07,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 10:37:07,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 10:37:08,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:37:10,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:37:11,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:37:12,267 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.32 vs. limit=15.0 2023-09-30 10:37:15,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 10:37:18,684 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.72 vs. limit=15.0 2023-09-30 10:37:23,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:37:24,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 10:37:25,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:37:25,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:25,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:37:25,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=685133.3333333334, ans=0.0 2023-09-30 10:37:27,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 10:37:29,658 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=685133.3333333334, ans=0.125 2023-09-30 10:37:30,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:37:30,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:37:33,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 10:37:33,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:36,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:36,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:37:36,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:38,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:39,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:37:41,845 INFO [train.py:1039] (1/4) Epoch 20, batch 1850, loss[loss=0.181, simple_loss=0.2493, pruned_loss=0.05639, over 23739.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2517, pruned_loss=0.05057, over 4706112.35 frames. ], batch size: 135, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:37:42,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:37:42,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:45,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:37:45,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:37:51,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:37:51,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 10:37:51,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=685200.0, ans=0.2 2023-09-30 10:37:54,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 10:37:59,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 10:38:04,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:04,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 10:38:04,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:38:09,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=685266.6666666666, ans=0.125 2023-09-30 10:38:14,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:38:17,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 10:38:19,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:38:20,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:25,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 10:38:25,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:25,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:38:27,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:38:28,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:38:30,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:38:32,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:38:33,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:33,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:38:33,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:37,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:39,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:38:41,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 10:38:43,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:43,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=685400.0, ans=0.2 2023-09-30 10:38:47,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:38:49,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:38:49,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 10:38:49,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 10:38:50,858 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 10:38:52,875 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 10:38:54,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:38:54,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:54,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:38:54,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=685466.6666666666, ans=0.1 2023-09-30 10:38:56,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:57,443 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 10:38:57,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:38:57,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:59,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:39:00,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:39:02,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:02,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 10:39:03,547 INFO [train.py:1039] (1/4) Epoch 20, batch 1900, loss[loss=0.1644, simple_loss=0.2506, pruned_loss=0.03908, over 24501.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2523, pruned_loss=0.05037, over 4715529.18 frames. ], batch size: 66, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:39:05,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:05,105 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 10:39:05,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:39:06,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:13,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:16,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:39:17,100 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 10:39:18,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 10:39:18,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:39:20,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:39:20,367 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 10:39:20,423 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 10:39:24,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 10:39:25,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:39:27,715 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.870e+02 2.135e+02 2.444e+02 3.596e+02, threshold=4.270e+02, percent-clipped=0.0 2023-09-30 10:39:29,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 10:39:32,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 10:39:41,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 10:39:45,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 10:39:45,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:45,844 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 10:39:45,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 10:39:47,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 10:39:47,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 10:39:47,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:39:47,648 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=685666.6666666666, ans=0.125 2023-09-30 10:39:52,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 10:39:56,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:39:57,035 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.47 vs. limit=10.0 2023-09-30 10:39:59,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:59,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 10:40:01,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:40:03,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 10:40:03,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:10,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:40:10,774 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:40:10,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:40:12,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:40:13,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:40:15,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:40:15,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:40:18,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:18,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:22,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:40:22,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:40:23,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:23,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:28,014 INFO [train.py:1039] (1/4) Epoch 20, batch 1950, loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03753, over 24312.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2532, pruned_loss=0.05132, over 4714668.75 frames. ], batch size: 56, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:40:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:29,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:40:29,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:29,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:40:32,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 10:40:33,082 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=685866.6666666666, ans=0.0 2023-09-30 10:40:34,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:40:34,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:36,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:39,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:40:39,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:39,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:42,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:40:45,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:45,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:40:45,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:40:45,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:45,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=685933.3333333334, ans=0.125 2023-09-30 10:40:48,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:51,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:51,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:51,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:40:51,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 10:40:53,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:40:54,162 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=15.0 2023-09-30 10:40:54,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:40:55,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:58,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:02,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:41:05,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:41:06,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=686000.0, ans=0.125 2023-09-30 10:41:07,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:41:07,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:09,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 10:41:09,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:14,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:41:14,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=686000.0, ans=0.125 2023-09-30 10:41:15,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:41:15,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:25,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:25,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:29,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:32,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:36,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:41:36,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:37,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 10:41:37,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:41:38,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:39,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 10:41:40,327 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.19 vs. limit=15.0 2023-09-30 10:41:41,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:41:46,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:47,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:41:47,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:50,530 INFO [train.py:1039] (1/4) Epoch 20, batch 2000, loss[loss=0.1557, simple_loss=0.2296, pruned_loss=0.0409, over 24612.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2538, pruned_loss=0.05126, over 4724991.43 frames. ], batch size: 60, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:41:50,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:41:53,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:55,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 10:41:56,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:57,030 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=686200.0, ans=0.125 2023-09-30 10:41:59,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:42:02,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 10:42:04,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:42:04,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:42:06,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:42:08,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 10:42:09,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:13,099 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.882e+02 2.052e+02 2.299e+02 3.277e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 10:42:13,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 10:42:13,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:42:16,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 10:42:16,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:16,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=686266.6666666666, ans=0.0 2023-09-30 10:42:20,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:42:21,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:42:21,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:23,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:24,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:26,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 10:42:29,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 10:42:29,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:29,260 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:42:35,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:36,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:42:36,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:38,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:42:39,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:39,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:42,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:42,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:44,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:44,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=686400.0, ans=0.0 2023-09-30 10:42:48,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:48,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 10:42:56,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:42:56,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:01,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:01,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:43:05,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:06,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=686466.6666666666, ans=15.0 2023-09-30 10:43:08,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:08,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:43:08,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:43:10,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:10,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:11,759 INFO [train.py:1039] (1/4) Epoch 20, batch 2050, loss[loss=0.1699, simple_loss=0.2444, pruned_loss=0.0477, over 23368.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2534, pruned_loss=0.05106, over 4717053.06 frames. ], batch size: 119, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:43:13,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:15,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:19,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=686533.3333333334, ans=0.125 2023-09-30 10:43:21,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:43:22,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=686533.3333333334, ans=0.1 2023-09-30 10:43:23,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:43:24,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:25,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:43:27,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 10:43:27,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:43:29,382 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:43:31,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:43:31,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:43:39,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:40,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:43,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 10:43:45,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:47,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 10:43:48,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:51,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:55,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:43:57,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:43:57,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:59,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:44:00,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:44:00,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:44:05,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:06,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:44:10,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:44:10,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:44:13,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:18,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:44:19,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 10:44:25,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:26,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:44:29,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:44:32,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 10:44:32,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=686866.6666666666, ans=0.0 2023-09-30 10:44:34,063 INFO [train.py:1039] (1/4) Epoch 20, batch 2100, loss[loss=0.1687, simple_loss=0.2513, pruned_loss=0.04305, over 24486.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2518, pruned_loss=0.05092, over 4700216.47 frames. ], batch size: 66, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:44:35,765 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 10:44:35,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:35,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:37,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:44:37,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:37,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 10:44:37,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 10:44:39,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:43,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:44:44,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:44:47,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:47,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:44:47,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 10:44:49,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:44:49,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 10:44:49,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 10:44:49,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=686933.3333333334, ans=0.0 2023-09-30 10:44:52,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:44:52,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:44:52,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 10:44:52,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 10:44:58,446 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.064e+02 2.437e+02 3.000e+02 4.850e+02, threshold=4.873e+02, percent-clipped=5.0 2023-09-30 10:44:58,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 10:44:58,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:45:02,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:03,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:45:07,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:45:07,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 10:45:09,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:09,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:45:12,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 10:45:13,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:13,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 10:45:13,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 10:45:13,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 10:45:17,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:45:19,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:45:22,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:26,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:26,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 10:45:26,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:26,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:28,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:28,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 10:45:30,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 10:45:30,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 10:45:33,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:45:37,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:45:39,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 10:45:44,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:47,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:45:49,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:45:49,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:45:49,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:45:50,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:45:52,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:52,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:45:52,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:45:52,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:55,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 10:45:57,361 INFO [train.py:1039] (1/4) Epoch 20, batch 2150, loss[loss=0.1527, simple_loss=0.2354, pruned_loss=0.03499, over 20080.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2508, pruned_loss=0.05029, over 4700995.42 frames. ], batch size: 44, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:45:57,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 10:45:57,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:59,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:59,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:45:59,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:45:59,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:46:04,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=687200.0, ans=0.2 2023-09-30 10:46:05,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:46:07,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:07,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:07,989 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.11 vs. limit=15.0 2023-09-30 10:46:09,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:46:09,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:09,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:46:13,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:46:15,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:46:21,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:21,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 10:46:26,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:28,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:46:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:28,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:29,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:29,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:46:31,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:31,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:46:31,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:46:32,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 10:46:35,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:46:35,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:35,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:37,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:46:38,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:46:42,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:42,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:46:46,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:46,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 10:46:47,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:46:49,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:51,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:52,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:54,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:46:54,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:46:55,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:56,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 10:46:58,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 10:46:58,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:47:00,930 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 10:47:01,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:01,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:02,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 10:47:02,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:47:02,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 10:47:02,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 10:47:02,576 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 10:47:02,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 10:47:04,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:47:04,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:47:05,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:07,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:47:08,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:08,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:18,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:47:18,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 10:47:20,067 INFO [train.py:1039] (1/4) Epoch 20, batch 2200, loss[loss=0.1438, simple_loss=0.2262, pruned_loss=0.0307, over 24591.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2505, pruned_loss=0.05008, over 4706028.98 frames. ], batch size: 60, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:47:20,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=687533.3333333334, ans=0.125 2023-09-30 10:47:24,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:47:28,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:30,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:47:30,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:47:32,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:47:35,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:35,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:47:35,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 10:47:40,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 10:47:42,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:47:45,303 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.904e+02 2.106e+02 2.500e+02 4.276e+02, threshold=4.212e+02, percent-clipped=0.0 2023-09-30 10:47:45,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 10:47:48,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:50,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:47:50,589 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:53,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:47:53,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 10:47:59,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:48:01,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:01,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:48:04,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:48:06,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:09,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:48:09,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=687733.3333333334, ans=0.05 2023-09-30 10:48:11,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:12,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 10:48:14,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:15,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 10:48:16,268 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.17 vs. limit=15.0 2023-09-30 10:48:18,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:18,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:48:18,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:21,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:48:23,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:23,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:23,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:25,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:48:25,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:48:28,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:48:31,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:48:31,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:48:35,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:48:35,665 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 10:48:38,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:48:38,887 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 10:48:40,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:48:41,860 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 10:48:43,251 INFO [train.py:1039] (1/4) Epoch 20, batch 2250, loss[loss=0.1859, simple_loss=0.2527, pruned_loss=0.05949, over 23568.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2512, pruned_loss=0.05066, over 4710594.62 frames. ], batch size: 256, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:48:43,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:43,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:48:45,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:47,166 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 10:48:48,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:48:50,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:48:53,928 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.40 vs. limit=12.0 2023-09-30 10:48:56,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:48:57,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:49:01,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:03,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:03,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:49:07,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 10:49:07,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:07,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:49:08,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 10:49:09,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:49:10,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:11,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:14,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=687933.3333333334, ans=0.04949747468305833 2023-09-30 10:49:17,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:18,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.85 vs. limit=10.0 2023-09-30 10:49:18,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:49:19,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:49:19,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=688000.0, ans=0.125 2023-09-30 10:49:20,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=688000.0, ans=0.1 2023-09-30 10:49:21,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 10:49:21,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:23,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:49:25,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=688000.0, ans=0.1 2023-09-30 10:49:28,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:28,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:28,581 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:49:31,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:49:31,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:35,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:36,123 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.08 vs. limit=15.0 2023-09-30 10:49:36,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:49:38,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:49:42,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:49:44,523 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.45 vs. limit=12.0 2023-09-30 10:49:47,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:49:47,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:49:47,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:49:49,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=688133.3333333334, ans=0.125 2023-09-30 10:49:51,345 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=688133.3333333334, ans=0.0 2023-09-30 10:49:52,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:49:55,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:49:55,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 10:49:55,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:49:56,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:49:59,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 10:50:01,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:50:03,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:06,369 INFO [train.py:1039] (1/4) Epoch 20, batch 2300, loss[loss=0.1604, simple_loss=0.2415, pruned_loss=0.03971, over 24340.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2526, pruned_loss=0.05149, over 4710840.88 frames. ], batch size: 61, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:50:10,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:10,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:50:11,755 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 10:50:14,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:21,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:50:21,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:50:23,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:50:23,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:23,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 10:50:25,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:50:26,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:28,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:50:30,982 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.830e+02 1.981e+02 2.237e+02 3.602e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 10:50:32,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:50:34,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:50:38,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:50:38,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=688333.3333333334, ans=0.125 2023-09-30 10:50:42,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:50:43,014 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:46,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:50:47,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:52,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:54,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:50:54,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:50:54,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 10:50:57,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:50:57,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:00,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:00,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:51:01,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:04,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 10:51:04,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:51:04,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 10:51:04,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:51:04,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:06,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 10:51:14,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:51:17,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.70 vs. limit=15.0 2023-09-30 10:51:18,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:51:22,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:22,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:51:22,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:51:25,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:51:25,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:51:26,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=688466.6666666666, ans=0.07 2023-09-30 10:51:27,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:51:28,888 INFO [train.py:1039] (1/4) Epoch 20, batch 2350, loss[loss=0.1714, simple_loss=0.2489, pruned_loss=0.04691, over 24652.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2535, pruned_loss=0.05207, over 4703001.70 frames. ], batch size: 65, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:51:28,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 10:51:35,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:51:35,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 10:51:38,006 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=688533.3333333334, ans=0.1 2023-09-30 10:51:43,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 10:51:46,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:49,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:51:50,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:51:53,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 10:51:56,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:52:01,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 10:52:02,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:52:04,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:52:04,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:52:08,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:52:10,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 10:52:10,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:52:13,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:52:13,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:14,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:52:17,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:52:19,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 10:52:20,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:52:24,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:52:24,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:52:25,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=688733.3333333334, ans=0.0 2023-09-30 10:52:26,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 10:52:26,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:52:29,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 10:52:29,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:52:34,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 10:52:36,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=688800.0, ans=0.125 2023-09-30 10:52:38,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 10:52:38,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:38,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:52:38,963 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 10:52:41,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 10:52:43,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 10:52:46,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:52:50,542 INFO [train.py:1039] (1/4) Epoch 20, batch 2400, loss[loss=0.1751, simple_loss=0.249, pruned_loss=0.0506, over 18304.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2531, pruned_loss=0.05147, over 4709396.52 frames. ], batch size: 39, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:52:52,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:52:55,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:52:55,688 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=688866.6666666666, ans=0.125 2023-09-30 10:52:57,010 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:52:57,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.77 vs. limit=15.0 2023-09-30 10:52:58,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 10:52:59,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 10:53:08,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:53:08,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:10,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 10:53:11,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:53:11,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:12,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=688933.3333333334, ans=0.125 2023-09-30 10:53:13,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 10:53:14,618 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.889e+02 2.089e+02 2.400e+02 4.035e+02, threshold=4.178e+02, percent-clipped=1.0 2023-09-30 10:53:18,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:19,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=688933.3333333334, ans=0.125 2023-09-30 10:53:22,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 10:53:27,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:53:31,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 10:53:35,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:53:35,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:40,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:53:42,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 10:53:42,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:53:50,215 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:52,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:53:56,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:56,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:53:56,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:53:57,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:53:57,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:57,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:53:57,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:54:01,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:02,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:54:02,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 10:54:04,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 10:54:07,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:54:07,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:54:08,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 10:54:09,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 10:54:09,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 10:54:09,135 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 10:54:10,668 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 10:54:10,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:54:12,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:12,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:14,367 INFO [train.py:1039] (1/4) Epoch 20, batch 2450, loss[loss=0.1521, simple_loss=0.2373, pruned_loss=0.0334, over 24448.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2517, pruned_loss=0.05031, over 4702239.79 frames. ], batch size: 63, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:54:14,554 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 10:54:15,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:16,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:54:16,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=689200.0, ans=0.1 2023-09-30 10:54:17,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=689200.0, ans=0.125 2023-09-30 10:54:20,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:54:20,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:24,263 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:24,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:54:26,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 10:54:30,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=15.0 2023-09-30 10:54:32,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:54:32,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:35,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:54:35,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:54:35,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:54:35,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 10:54:42,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:45,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:54:45,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:49,103 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=689333.3333333334, ans=0.0 2023-09-30 10:54:50,149 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.78 vs. limit=22.5 2023-09-30 10:54:50,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:54:50,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:52,835 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:54:53,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 10:54:55,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:54:55,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=689333.3333333334, ans=0.125 2023-09-30 10:55:02,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:05,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:55:05,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:06,873 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:55:06,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:08,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:55:08,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 10:55:13,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:55:13,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:55:15,665 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=689400.0, ans=0.125 2023-09-30 10:55:16,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:55:16,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:22,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:55:22,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 10:55:25,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:55:25,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:55:25,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 10:55:25,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=689466.6666666666, ans=0.1 2023-09-30 10:55:26,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:55:28,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:55:28,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689466.6666666666, ans=0.1 2023-09-30 10:55:31,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:55:33,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:33,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:55:36,686 INFO [train.py:1039] (1/4) Epoch 20, batch 2500, loss[loss=0.1678, simple_loss=0.2458, pruned_loss=0.0449, over 24660.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2516, pruned_loss=0.0501, over 4713521.95 frames. ], batch size: 65, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:55:36,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 10:55:38,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:55:44,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:50,787 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=689533.3333333334, ans=0.125 2023-09-30 10:55:55,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:55:55,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:55,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689600.0, ans=0.1 2023-09-30 10:55:56,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:56,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 10:56:01,701 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.945e+02 2.213e+02 2.493e+02 3.500e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-30 10:56:03,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:56:03,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:03,873 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=689600.0, ans=0.125 2023-09-30 10:56:05,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:56:05,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:56:05,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 10:56:07,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:07,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:08,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 10:56:08,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:10,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 10:56:10,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:15,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:56:16,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:20,538 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:56:20,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 10:56:22,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:56:22,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:26,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:31,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:35,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:40,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:56:43,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 10:56:43,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:43,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:56:46,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:56:46,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:56:48,431 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 10:56:48,432 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 10:56:48,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 10:56:50,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=689800.0, ans=0.0 2023-09-30 10:56:53,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:55,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 10:56:55,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 10:56:56,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:56,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 10:56:56,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=689800.0, ans=0.1 2023-09-30 10:56:59,587 INFO [train.py:1039] (1/4) Epoch 20, batch 2550, loss[loss=0.1694, simple_loss=0.2496, pruned_loss=0.04459, over 24638.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2515, pruned_loss=0.04981, over 4720333.46 frames. ], batch size: 65, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:56:59,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 10:57:04,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:05,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:57:06,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=689866.6666666666, ans=0.0 2023-09-30 10:57:07,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:57:10,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:12,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 10:57:12,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:57:15,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 10:57:15,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:57:19,123 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:22,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:57:22,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 10:57:23,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:24,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:24,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:27,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:57:27,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 10:57:28,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:57:28,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:28,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 10:57:37,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=690000.0, ans=0.125 2023-09-30 10:57:43,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:57:46,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:57:48,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:48,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:49,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:57:55,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:58,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:58,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:57:58,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:58:00,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:58:00,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:58:05,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:05,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:11,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:58:11,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 10:58:11,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:58:11,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:13,248 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:58:14,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:58:15,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:20,507 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=690200.0, ans=0.125 2023-09-30 10:58:21,640 INFO [train.py:1039] (1/4) Epoch 20, batch 2600, loss[loss=0.1933, simple_loss=0.2588, pruned_loss=0.0639, over 23726.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2522, pruned_loss=0.05026, over 4721273.53 frames. ], batch size: 232, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:58:23,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:58:26,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:26,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=690200.0, ans=0.125 2023-09-30 10:58:26,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=690200.0, ans=0.125 2023-09-30 10:58:28,810 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 10:58:30,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=690200.0, ans=0.0 2023-09-30 10:58:31,806 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 10:58:31,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:58:31,890 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 10:58:33,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 10:58:33,871 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 10:58:38,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:38,262 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 10:58:38,731 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=690266.6666666666, ans=0.125 2023-09-30 10:58:40,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 10:58:42,020 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 10:58:43,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:58:45,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 10:58:46,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 10:58:46,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:58:48,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 10:58:49,613 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.870e+02 2.130e+02 2.382e+02 3.027e+02, threshold=4.260e+02, percent-clipped=0.0 2023-09-30 10:58:50,096 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:58:51,220 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 10:58:51,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 10:58:55,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=690333.3333333334, ans=0.0 2023-09-30 10:58:58,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:58:59,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:59,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:58:59,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 10:59:01,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=690333.3333333334, ans=0.125 2023-09-30 10:59:03,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:59:09,274 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 10:59:13,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=690400.0, ans=0.04949747468305833 2023-09-30 10:59:15,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:15,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:16,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 10:59:16,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:16,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:59:18,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 10:59:21,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:59:21,555 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=690400.0, ans=0.125 2023-09-30 10:59:22,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:59:24,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:27,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 10:59:29,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:29,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:59:34,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:34,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:59:34,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 10:59:36,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:39,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:59:39,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:59:43,949 INFO [train.py:1039] (1/4) Epoch 20, batch 2650, loss[loss=0.1772, simple_loss=0.2523, pruned_loss=0.05111, over 23368.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2529, pruned_loss=0.05126, over 4714169.97 frames. ], batch size: 105, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:59:44,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 10:59:45,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:49,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:59:54,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 10:59:54,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:55,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:59:57,115 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 10:59:57,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:59:59,302 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.29 vs. limit=10.0 2023-09-30 11:00:00,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:01,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:00:04,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:00:06,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:00:07,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 11:00:07,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:00:07,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:00:09,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 11:00:09,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=690600.0, ans=0.125 2023-09-30 11:00:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 11:00:13,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=690600.0, ans=0.125 2023-09-30 11:00:16,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:18,665 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-09-30 11:00:18,696 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-09-30 11:00:19,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 11:00:19,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:21,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 11:00:22,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:23,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:00:23,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:25,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:30,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 11:00:30,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 11:00:33,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:00:36,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 11:00:38,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:38,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:38,219 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:00:39,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:39,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:41,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:43,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:44,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:46,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:00:47,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:00:49,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:49,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:00:52,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:52,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:52,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:00:55,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:57,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:00:57,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:57,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 11:01:01,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:03,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:05,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:06,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:06,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:01:08,190 INFO [train.py:1039] (1/4) Epoch 20, batch 2700, loss[loss=0.1781, simple_loss=0.2656, pruned_loss=0.04532, over 24335.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2534, pruned_loss=0.05141, over 4727596.47 frames. ], batch size: 74, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 11:01:08,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:11,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:11,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 11:01:12,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:01:14,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:01:16,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:01:16,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:16,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:17,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=690866.6666666666, ans=0.1 2023-09-30 11:01:19,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:01:19,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:01:19,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:01:19,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:01:19,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 11:01:21,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:01:22,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:01:24,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:01:24,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:28,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:01:30,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 11:01:30,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:01:35,956 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.889e+02 2.106e+02 2.437e+02 3.198e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 11:01:37,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:01:38,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:01:44,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=691000.0, ans=0.125 2023-09-30 11:01:45,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:01:45,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:45,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:01:45,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:01:47,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:01:51,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:51,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:01:51,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:01:53,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=691000.0, ans=0.025 2023-09-30 11:01:56,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:56,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:01:57,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=691066.6666666666, ans=0.0 2023-09-30 11:01:57,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691066.6666666666, ans=0.1 2023-09-30 11:02:00,975 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:02:04,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:02:04,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:04,798 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=691066.6666666666, ans=0.125 2023-09-30 11:02:04,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=691066.6666666666, ans=0.125 2023-09-30 11:02:08,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:02:08,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:10,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:12,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:13,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:02:15,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:16,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:16,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:19,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:02:21,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:21,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:24,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 11:02:26,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:28,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:02:28,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 11:02:29,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 11:02:29,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:30,956 INFO [train.py:1039] (1/4) Epoch 20, batch 2750, loss[loss=0.1586, simple_loss=0.2407, pruned_loss=0.03824, over 24451.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2533, pruned_loss=0.05137, over 4716954.16 frames. ], batch size: 63, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:02:34,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:02:34,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:38,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:02:39,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:42,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:02:42,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:02:44,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:02:44,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:44,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 11:02:44,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:02:44,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:50,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 11:02:53,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:55,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:55,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:55,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:02:56,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:58,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:03:00,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:00,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:03,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:03:03,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:03:04,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:03:05,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=691333.3333333334, ans=0.125 2023-09-30 11:03:06,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:06,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:03:10,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.64 vs. limit=15.0 2023-09-30 11:03:13,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:14,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:03:14,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:20,046 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.67 vs. limit=15.0 2023-09-30 11:03:20,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:20,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:03:20,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:03:21,116 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=691400.0, ans=0.125 2023-09-30 11:03:28,042 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=691400.0, ans=0.0 2023-09-30 11:03:28,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=691400.0, ans=0.125 2023-09-30 11:03:29,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:03:29,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:03:29,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 11:03:35,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:36,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691466.6666666666, ans=0.1 2023-09-30 11:03:37,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 11:03:42,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:03:45,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:03:45,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 11:03:47,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:03:48,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:03:48,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 11:03:50,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:03:52,273 INFO [train.py:1039] (1/4) Epoch 20, batch 2800, loss[loss=0.1604, simple_loss=0.2294, pruned_loss=0.04573, over 23668.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.253, pruned_loss=0.05107, over 4715610.61 frames. ], batch size: 149, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:03:53,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:03:53,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:03:53,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:03:55,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 11:03:56,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:57,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:59,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:59,357 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 11:03:59,358 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 11:04:01,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:02,907 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=691533.3333333334, ans=0.125 2023-09-30 11:04:02,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=691533.3333333334, ans=0.125 2023-09-30 11:04:03,013 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=691533.3333333334, ans=0.125 2023-09-30 11:04:04,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:04:04,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:04:04,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=691533.3333333334, ans=0.1 2023-09-30 11:04:06,221 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=691533.3333333334, ans=0.2 2023-09-30 11:04:06,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.73 vs. limit=15.0 2023-09-30 11:04:07,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:04:10,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 11:04:12,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:04:13,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 11:04:13,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:14,670 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.50 vs. limit=15.0 2023-09-30 11:04:15,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:04:15,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:20,054 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.934e+02 2.198e+02 2.522e+02 3.773e+02, threshold=4.395e+02, percent-clipped=0.0 2023-09-30 11:04:20,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:20,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=691600.0, ans=0.125 2023-09-30 11:04:21,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:21,690 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:04:21,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:04:22,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=691600.0, ans=0.125 2023-09-30 11:04:30,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:04:32,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:34,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:35,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:04:36,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:43,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:43,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 11:04:44,309 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=691733.3333333334, ans=0.0 2023-09-30 11:04:44,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.98 vs. limit=22.5 2023-09-30 11:04:45,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:45,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:04:49,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:50,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:55,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:57,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:04:57,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:57,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:04:58,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:05:00,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:05:00,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:05:00,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 11:05:00,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:05:02,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 11:05:04,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:04,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:05:06,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:05:07,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 11:05:12,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=691800.0, ans=0.0 2023-09-30 11:05:15,441 INFO [train.py:1039] (1/4) Epoch 20, batch 2850, loss[loss=0.1851, simple_loss=0.2509, pruned_loss=0.05961, over 23428.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2518, pruned_loss=0.05069, over 4717068.77 frames. ], batch size: 285, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:05:15,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:05:15,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:05:16,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:05:20,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:23,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:05:23,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:05:25,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:05:28,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:28,810 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=691866.6666666666, ans=0.2 2023-09-30 11:05:29,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:05:31,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:05:31,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 11:05:34,910 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=691933.3333333334, ans=0.125 2023-09-30 11:05:38,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 11:05:38,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:40,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 11:05:40,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:43,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 11:05:45,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 11:05:46,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:00,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:01,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:01,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:06:01,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:06:01,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=692000.0, ans=0.0 2023-09-30 11:06:03,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:06:03,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:06:06,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:06:06,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 11:06:06,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=692066.6666666666, ans=0.0 2023-09-30 11:06:06,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=692066.6666666666, ans=0.125 2023-09-30 11:06:09,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:06:09,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:09,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:11,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:12,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:12,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:16,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:18,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:19,988 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:06:21,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:21,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:25,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:06:25,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=692133.3333333334, ans=0.125 2023-09-30 11:06:28,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:06:29,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 11:06:30,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 11:06:33,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:06:33,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=692133.3333333334, ans=0.125 2023-09-30 11:06:34,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:34,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 11:06:35,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:06:35,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:36,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:36,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:06:36,535 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 11:06:36,625 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 11:06:36,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:37,965 INFO [train.py:1039] (1/4) Epoch 20, batch 2900, loss[loss=0.1889, simple_loss=0.2659, pruned_loss=0.05595, over 23277.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2523, pruned_loss=0.05045, over 4722713.80 frames. ], batch size: 93, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:06:38,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:44,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:06:44,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:44,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:45,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 11:06:49,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:51,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 11:06:51,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 11:06:52,240 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.38 vs. limit=15.0 2023-09-30 11:06:52,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:06:52,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:06:55,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:55,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:58,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:59,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:07:02,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:07:04,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 11:07:05,492 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.792e+02 1.963e+02 2.139e+02 2.917e+02, threshold=3.926e+02, percent-clipped=0.0 2023-09-30 11:07:05,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:07:07,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:09,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 11:07:11,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 11:07:14,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:07:14,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 11:07:14,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:07:16,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:07:16,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:07:19,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:07:20,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:24,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:07:29,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:07:31,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 11:07:33,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 11:07:33,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:07:36,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:07:39,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 11:07:40,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:07:46,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:55,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:07:55,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:07:57,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 11:08:00,651 INFO [train.py:1039] (1/4) Epoch 20, batch 2950, loss[loss=0.1874, simple_loss=0.2644, pruned_loss=0.05524, over 23388.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2532, pruned_loss=0.05065, over 4720968.94 frames. ], batch size: 93, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:08:00,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:00,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 11:08:01,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=692533.3333333334, ans=0.125 2023-09-30 11:08:02,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:02,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:08:08,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:09,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 11:08:11,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:11,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:14,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:14,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:08:15,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 11:08:15,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 11:08:17,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:08:17,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:22,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:24,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:24,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=15.0 2023-09-30 11:08:25,922 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=692600.0, ans=0.0 2023-09-30 11:08:27,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:08:27,399 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=692600.0, ans=0.125 2023-09-30 11:08:28,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:33,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:08:33,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:08:35,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:08:38,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 11:08:39,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=692666.6666666666, ans=0.0 2023-09-30 11:08:41,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=692666.6666666666, ans=0.125 2023-09-30 11:08:43,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 11:08:43,949 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 11:08:45,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:08:46,888 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 11:08:48,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 11:08:48,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:50,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:50,124 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 11:08:50,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:08:53,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 11:08:55,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:55,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:08:58,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:59,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:09:01,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:01,273 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 11:09:01,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:09:01,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 11:09:08,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:10,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:10,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 11:09:11,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:09:12,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=692800.0, ans=0.0 2023-09-30 11:09:13,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 11:09:16,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:16,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:09:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:09:18,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:19,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:09:21,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:09:22,955 INFO [train.py:1039] (1/4) Epoch 20, batch 3000, loss[loss=0.1818, simple_loss=0.2553, pruned_loss=0.05417, over 23335.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2531, pruned_loss=0.05038, over 4733887.81 frames. ], batch size: 105, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:09:22,956 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 11:09:37,406 INFO [train.py:1071] (1/4) Epoch 20, validation: loss=0.3156, simple_loss=0.2725, pruned_loss=0.1794, over 1125622.00 frames. 2023-09-30 11:09:37,407 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 11:09:37,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:37,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:09:37,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:09:39,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:40,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:09:40,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:40,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 11:09:41,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:43,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:09:44,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:09:48,502 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 11:09:49,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 11:09:51,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:52,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:09:54,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 11:09:54,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:09:59,162 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=692933.3333333334, ans=0.0 2023-09-30 11:10:00,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:10:05,478 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.867e+02 2.114e+02 2.609e+02 3.839e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 11:10:10,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:10:15,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 11:10:17,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:10:21,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:10:22,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:10:23,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:10:26,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:26,027 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 11:10:27,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 11:10:29,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:10:30,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:10:32,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:10:32,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:33,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:33,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:10:38,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:10:39,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:39,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:10:40,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:42,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 11:10:42,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:10:42,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:10:42,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:10:44,453 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:10:47,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:47,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:49,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:10:49,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 11:10:49,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:10:51,407 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 11:10:51,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:10:54,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 11:10:57,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:10:58,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:10:58,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 11:11:00,424 INFO [train.py:1039] (1/4) Epoch 20, batch 3050, loss[loss=0.178, simple_loss=0.2599, pruned_loss=0.04806, over 23856.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2541, pruned_loss=0.05082, over 4733546.98 frames. ], batch size: 86, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:11:00,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 11:11:00,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:11:01,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:11:03,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:11:03,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:11:03,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:03,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:11:06,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 11:11:08,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:10,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:10,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:11:15,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:18,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 11:11:23,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 11:11:23,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 11:11:25,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:28,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:11:29,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=693266.6666666666, ans=0.125 2023-09-30 11:11:31,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:31,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:32,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:32,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=693333.3333333334, ans=0.1 2023-09-30 11:11:38,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:11:38,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:11:40,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:40,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:40,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:42,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:45,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:46,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:47,022 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=693333.3333333334, ans=0.1 2023-09-30 11:11:48,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 11:11:48,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:48,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:11:53,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:54,443 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:11:54,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:11:56,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:11:57,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=693400.0, ans=0.125 2023-09-30 11:12:00,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:12:00,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:06,357 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=693400.0, ans=0.5 2023-09-30 11:12:09,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:10,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:10,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:12,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:12,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:12:12,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:12:13,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 11:12:15,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:15,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:17,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 11:12:19,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:19,366 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:12:23,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:25,026 INFO [train.py:1039] (1/4) Epoch 20, batch 3100, loss[loss=0.1576, simple_loss=0.2411, pruned_loss=0.03704, over 24430.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2536, pruned_loss=0.051, over 4736317.18 frames. ], batch size: 66, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:12:26,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:12:28,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:12:30,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 11:12:31,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 11:12:33,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 11:12:35,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:12:39,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:12:39,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:42,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:12:47,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:51,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=693600.0, ans=0.125 2023-09-30 11:12:52,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 11:12:54,594 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=693600.0, ans=0.0 2023-09-30 11:12:55,875 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.886e+02 2.147e+02 2.525e+02 3.564e+02, threshold=4.295e+02, percent-clipped=0.0 2023-09-30 11:12:57,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:12:57,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:57,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:12:59,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:59,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:13:00,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:13:02,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 11:13:02,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:13:03,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:05,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 11:13:06,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:13:12,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:13:13,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 11:13:14,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 11:13:14,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=693733.3333333334, ans=0.2 2023-09-30 11:13:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:18,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:20,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:20,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:20,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:13:22,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:13:22,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:13:24,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:13:24,058 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:13:24,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:24,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:13:26,864 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.46 vs. limit=12.0 2023-09-30 11:13:29,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:13:30,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 11:13:32,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:13:33,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 11:13:35,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:35,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:35,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 11:13:35,677 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=693800.0, ans=0.04949747468305833 2023-09-30 11:13:42,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=693800.0, ans=0.2 2023-09-30 11:13:45,374 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=693800.0, ans=0.07 2023-09-30 11:13:47,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 11:13:47,726 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=693866.6666666666, ans=0.2 2023-09-30 11:13:49,307 INFO [train.py:1039] (1/4) Epoch 20, batch 3150, loss[loss=0.1547, simple_loss=0.2371, pruned_loss=0.03621, over 24627.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2507, pruned_loss=0.05055, over 4710927.64 frames. ], batch size: 65, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:13:51,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:51,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:52,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:13:52,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:13:52,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 11:13:54,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:55,261 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.66 vs. limit=15.0 2023-09-30 11:13:55,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:13:57,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 11:13:59,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:01,603 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 11:14:04,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 11:14:06,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:06,269 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 11:14:07,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:14:09,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 11:14:09,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 11:14:09,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 11:14:09,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:09,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:10,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:14,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 11:14:15,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:15,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:17,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:19,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=693933.3333333334, ans=0.125 2023-09-30 11:14:20,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:14:24,003 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=694000.0, ans=0.95 2023-09-30 11:14:25,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 11:14:25,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:14:29,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:14:30,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:30,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 11:14:33,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 11:14:35,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:14:36,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:14:36,052 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:14:36,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:36,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:14:37,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:14:37,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:14:39,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 11:14:40,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:14:40,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:42,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:14:42,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:42,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=694066.6666666666, ans=0.0 2023-09-30 11:14:43,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 11:14:43,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:45,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 11:14:45,866 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:14:45,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=694066.6666666666, ans=0.125 2023-09-30 11:14:47,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:47,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 11:14:47,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 11:14:50,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:14:50,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:51,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 11:14:52,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 11:14:52,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:54,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=694133.3333333334, ans=0.1 2023-09-30 11:14:56,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:57,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:57,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:15:03,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=694133.3333333334, ans=0.125 2023-09-30 11:15:05,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:15:05,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:08,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 11:15:12,655 INFO [train.py:1039] (1/4) Epoch 20, batch 3200, loss[loss=0.1673, simple_loss=0.2412, pruned_loss=0.04671, over 24468.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2501, pruned_loss=0.0504, over 4711258.97 frames. ], batch size: 58, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:15:12,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:15:12,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 11:15:18,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:20,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:15:20,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 11:15:22,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:15:25,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:15:29,840 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.75 vs. limit=22.5 2023-09-30 11:15:30,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:39,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:15:42,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.858e+02 2.103e+02 2.505e+02 4.292e+02, threshold=4.206e+02, percent-clipped=0.0 2023-09-30 11:15:50,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 11:15:50,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=694333.3333333334, ans=0.0 2023-09-30 11:15:51,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:15:54,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 11:15:54,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:15:54,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=694333.3333333334, ans=0.0 2023-09-30 11:15:58,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:15:58,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:16:00,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:16:03,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 11:16:06,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:16:08,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 11:16:10,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=694400.0, ans=0.0 2023-09-30 11:16:12,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=694400.0, ans=0.0 2023-09-30 11:16:13,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 11:16:15,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:16:18,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=694466.6666666666, ans=0.125 2023-09-30 11:16:20,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=694466.6666666666, ans=0.1 2023-09-30 11:16:22,923 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:22,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:16:23,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:23,088 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 11:16:23,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:16:26,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:27,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 11:16:29,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 11:16:30,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 11:16:30,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 11:16:33,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:16:35,053 INFO [train.py:1039] (1/4) Epoch 20, batch 3250, loss[loss=0.1794, simple_loss=0.2468, pruned_loss=0.05604, over 23585.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2503, pruned_loss=0.05004, over 4721738.55 frames. ], batch size: 285, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:16:36,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:16:36,617 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 11:16:36,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:16:36,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:16:39,551 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 11:16:44,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:16:49,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:16:49,530 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=694600.0, ans=0.2 2023-09-30 11:16:50,006 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.81 vs. limit=6.0 2023-09-30 11:16:51,222 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=694600.0, ans=0.0 2023-09-30 11:16:56,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:16:56,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 11:16:57,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:57,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:57,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:16:59,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:16:59,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:17:01,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=694600.0, ans=0.125 2023-09-30 11:17:02,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:02,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:17:02,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:04,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:10,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:11,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:17:14,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:14,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:15,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:15,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:17:15,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:19,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 11:17:21,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:17:21,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:17:22,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:24,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:17:31,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:17:39,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:17:39,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:39,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 11:17:39,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:17:39,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:17:41,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:43,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 11:17:44,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 11:17:45,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:47,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:48,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:48,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:17:48,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:51,854 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:17:53,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:54,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:17:56,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 11:17:56,088 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:17:57,622 INFO [train.py:1039] (1/4) Epoch 20, batch 3300, loss[loss=0.1807, simple_loss=0.2608, pruned_loss=0.05026, over 23606.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.251, pruned_loss=0.05015, over 4719778.56 frames. ], batch size: 85, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:17:57,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:17:57,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 11:17:58,393 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.14 vs. limit=22.5 2023-09-30 11:18:01,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:18:01,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 11:18:04,381 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 11:18:04,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 11:18:06,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:09,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=694866.6666666666, ans=0.0 2023-09-30 11:18:11,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:18:12,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:18:12,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:12,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:18:14,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:18:16,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:18,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:18:24,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 11:18:24,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:24,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:26,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:27,590 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 11:18:27,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:18:29,024 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.888e+02 2.046e+02 2.323e+02 3.227e+02, threshold=4.091e+02, percent-clipped=0.0 2023-09-30 11:18:29,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:18:30,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:18:30,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:18:30,763 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 11:18:37,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:37,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:18:39,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:39,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 11:18:39,332 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=695000.0, ans=0.125 2023-09-30 11:18:40,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 11:18:41,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:42,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:18:44,038 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 11:18:45,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 11:18:45,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:18:47,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 11:18:51,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:18:53,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:18:53,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:18:56,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:56,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:56,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:56,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:19:00,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:19:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:01,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:19:03,605 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 11:19:03,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 11:19:08,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:19:08,128 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:08,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:09,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:19:09,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:10,007 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=695133.3333333334, ans=0.5 2023-09-30 11:19:11,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:19:11,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:11,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:19:14,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:15,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:19:18,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 11:19:20,122 INFO [train.py:1039] (1/4) Epoch 20, batch 3350, loss[loss=0.1759, simple_loss=0.2449, pruned_loss=0.05342, over 23579.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2521, pruned_loss=0.05068, over 4719452.88 frames. ], batch size: 149, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:19:20,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:20,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=695200.0, ans=0.0 2023-09-30 11:19:21,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:23,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:19:23,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:19:25,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:27,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:27,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:30,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:19:31,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:33,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:19:35,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:35,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=695266.6666666666, ans=0.2 2023-09-30 11:19:37,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:19:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:40,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:19:40,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=695266.6666666666, ans=0.125 2023-09-30 11:19:41,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 11:19:43,312 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 11:19:44,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:48,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 11:19:48,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 11:19:48,629 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:19:48,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:19:50,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:51,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=695266.6666666666, ans=0.1 2023-09-30 11:19:52,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 11:19:52,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:52,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:19:55,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:56,320 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.91 vs. limit=12.0 2023-09-30 11:19:57,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:57,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:58,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:20:02,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:03,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=695333.3333333334, ans=0.125 2023-09-30 11:20:05,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:06,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:10,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:20:10,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:20:12,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:12,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:15,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:16,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 11:20:16,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:20:16,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 11:20:16,803 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:20:18,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 11:20:20,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:21,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:26,215 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.48 vs. limit=15.0 2023-09-30 11:20:30,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:32,049 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 11:20:32,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:20:33,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:20:33,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:20:35,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=695466.6666666666, ans=0.2 2023-09-30 11:20:39,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:20:43,110 INFO [train.py:1039] (1/4) Epoch 20, batch 3400, loss[loss=0.1758, simple_loss=0.2578, pruned_loss=0.04684, over 23739.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2534, pruned_loss=0.05172, over 4715898.10 frames. ], batch size: 85, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:20:43,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 11:20:43,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:20:43,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:20:44,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:44,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 11:20:46,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:46,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 11:20:47,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:47,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:49,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:20:49,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:20:50,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 11:20:55,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 11:20:55,852 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 11:20:55,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:00,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:21:00,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:21:02,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:04,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:21:07,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:10,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 11:21:10,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=695600.0, ans=0.125 2023-09-30 11:21:12,958 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.43 vs. limit=15.0 2023-09-30 11:21:14,116 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.831e+02 2.000e+02 2.171e+02 2.770e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 11:21:15,835 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:21:18,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:18,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:20,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:21:23,891 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=695666.6666666666, ans=0.0 2023-09-30 11:21:26,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:21:30,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 11:21:37,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 11:21:37,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:21:38,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:40,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:40,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:21:41,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:48,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:21:48,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:21:53,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:21:54,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 11:21:59,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:22:04,513 INFO [train.py:1039] (1/4) Epoch 20, batch 3450, loss[loss=0.169, simple_loss=0.2455, pruned_loss=0.04628, over 24339.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2535, pruned_loss=0.0514, over 4725386.86 frames. ], batch size: 61, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:22:04,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 11:22:11,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 11:22:11,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:13,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:22:13,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 11:22:15,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:22:18,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:22:20,309 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:22:24,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:22:25,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:25,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=695933.3333333334, ans=0.1 2023-09-30 11:22:26,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:22:26,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:28,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:33,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 11:22:39,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 11:22:41,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:22:41,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:22:43,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:50,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 11:22:50,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=696000.0, ans=0.0 2023-09-30 11:22:51,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:22:53,611 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=696066.6666666666, ans=0.2 2023-09-30 11:22:55,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:22:55,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:57,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:22:58,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:23:00,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 11:23:00,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:03,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:23:06,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:07,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 11:23:11,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:23:15,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:23:19,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:20,900 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:26,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:26,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:23:26,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:23:26,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:27,417 INFO [train.py:1039] (1/4) Epoch 20, batch 3500, loss[loss=0.1714, simple_loss=0.2172, pruned_loss=0.06273, over 19447.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2512, pruned_loss=0.05087, over 4721048.61 frames. ], batch size: 388, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:23:28,311 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.70 vs. limit=15.0 2023-09-30 11:23:31,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:34,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:23:34,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 11:23:37,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:23:41,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:23:44,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:44,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 11:23:45,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=696266.6666666666, ans=0.2 2023-09-30 11:23:49,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:23:49,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:52,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:23:52,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:23:52,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:23:53,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:53,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:23:55,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 11:23:58,756 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.890e+02 2.132e+02 2.452e+02 4.334e+02, threshold=4.264e+02, percent-clipped=1.0 2023-09-30 11:23:58,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:58,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:24:00,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:04,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:05,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 11:24:05,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:24:07,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:10,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:24:10,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:12,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:24:12,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:14,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 11:24:14,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 11:24:15,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 11:24:15,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:17,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:18,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:19,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:24:23,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:24:24,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:24:24,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=696400.0, ans=0.2 2023-09-30 11:24:25,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=696400.0, ans=0.09899494936611666 2023-09-30 11:24:30,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:24:30,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 11:24:31,016 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=696400.0, ans=0.125 2023-09-30 11:24:32,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 11:24:32,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:24:35,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:35,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:36,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:41,813 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 11:24:43,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:45,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:45,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 11:24:46,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 11:24:48,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:49,896 INFO [train.py:1039] (1/4) Epoch 20, batch 3550, loss[loss=0.1694, simple_loss=0.2378, pruned_loss=0.0505, over 23647.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2501, pruned_loss=0.05043, over 4722754.61 frames. ], batch size: 256, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:24:50,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:51,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:24:51,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:24:56,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:25:01,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=696533.3333333334, ans=0.0 2023-09-30 11:25:05,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:07,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:25:10,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:12,397 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:25:14,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:14,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:25:14,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:25:17,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:17,973 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=696600.0, ans=0.0 2023-09-30 11:25:17,974 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=696600.0, ans=0.125 2023-09-30 11:25:19,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:25:19,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:19,586 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:25:21,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:25:27,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:25:27,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:28,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:28,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:29,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:25:29,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 11:25:29,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:32,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:32,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:25:35,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=696666.6666666666, ans=0.125 2023-09-30 11:25:40,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:40,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:42,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:44,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 11:25:44,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:25:45,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 11:25:47,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:48,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:25:48,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:25:54,549 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 11:25:56,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:25:56,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=696800.0, ans=0.125 2023-09-30 11:25:59,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:01,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 11:26:02,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:05,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:26:07,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 11:26:12,346 INFO [train.py:1039] (1/4) Epoch 20, batch 3600, loss[loss=0.1659, simple_loss=0.2536, pruned_loss=0.03913, over 24507.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2507, pruned_loss=0.05012, over 4727018.70 frames. ], batch size: 66, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:26:14,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 11:26:14,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:26:16,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:26:16,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=696866.6666666666, ans=0.0 2023-09-30 11:26:17,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:17,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:18,100 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=696866.6666666666, ans=0.125 2023-09-30 11:26:19,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:26:21,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=696866.6666666666, ans=0.0 2023-09-30 11:26:22,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:24,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:26,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:26:26,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:26:27,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:27,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 11:26:30,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:26:32,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:34,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:39,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:40,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:26:40,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:41,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 11:26:42,055 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:42,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=696933.3333333334, ans=0.025 2023-09-30 11:26:43,321 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.766e+02 1.915e+02 2.241e+02 3.227e+02, threshold=3.831e+02, percent-clipped=0.0 2023-09-30 11:26:43,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:45,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:26:45,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:47,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:47,715 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:26:48,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:26:50,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 11:26:58,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:26:58,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=697000.0, ans=0.0 2023-09-30 11:27:00,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:27:00,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 11:27:06,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:27:13,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:16,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:23,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:27:23,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:27:23,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 11:27:25,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 11:27:26,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 11:27:28,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:27:28,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:27:30,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=697133.3333333334, ans=0.0 2023-09-30 11:27:31,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 11:27:31,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:27:31,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:27:31,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:32,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 11:27:34,880 INFO [train.py:1039] (1/4) Epoch 20, batch 3650, loss[loss=0.1594, simple_loss=0.235, pruned_loss=0.04194, over 23605.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2515, pruned_loss=0.05048, over 4732465.39 frames. ], batch size: 149, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:27:34,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 11:27:36,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:38,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 11:27:42,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=697200.0, ans=0.0 2023-09-30 11:27:44,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 11:27:47,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:27:49,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=697200.0, ans=0.125 2023-09-30 11:27:51,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 11:27:53,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 11:27:56,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=697266.6666666666, ans=0.1 2023-09-30 11:27:58,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:27:58,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:27:58,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:28:01,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:28:02,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:28:02,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 11:28:04,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:28:04,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:05,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 11:28:07,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:28:07,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:07,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:08,521 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.56 vs. limit=12.0 2023-09-30 11:28:09,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:28:12,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 11:28:13,032 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 11:28:15,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:28:18,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 11:28:21,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:21,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:28:26,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:28:29,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:29,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:28:29,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:28:31,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:28:33,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:28:35,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:36,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:36,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:38,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:28:39,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:39,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:46,599 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 11:28:50,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:50,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:52,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:28:53,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:28:54,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:28:56,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:58,364 INFO [train.py:1039] (1/4) Epoch 20, batch 3700, loss[loss=0.2262, simple_loss=0.2908, pruned_loss=0.08075, over 19452.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2532, pruned_loss=0.05136, over 4723630.77 frames. ], batch size: 388, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:28:58,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 11:28:58,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:00,176 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:29:03,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:29:03,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:29:06,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:06,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 11:29:06,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:08,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:29:08,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:29:11,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:29:16,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:29:17,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:19,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:29:19,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:20,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:29:22,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:22,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=697600.0, ans=0.1 2023-09-30 11:29:24,427 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 11:29:31,976 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.879e+02 2.205e+02 2.626e+02 3.954e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-30 11:29:32,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:29:32,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:29:33,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:29:33,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 11:29:33,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:35,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:37,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 11:29:38,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:40,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:29:40,709 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=697666.6666666666, ans=0.125 2023-09-30 11:29:44,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:44,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:29:45,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:29:48,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:48,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 11:29:50,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:50,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 11:29:55,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:29:56,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:30:00,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:01,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 11:30:03,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:30:03,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:30:03,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:03,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:08,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:09,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 11:30:11,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 11:30:11,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:30:11,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:13,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:30:14,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:30:15,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=697800.0, ans=0.125 2023-09-30 11:30:18,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:30:21,081 INFO [train.py:1039] (1/4) Epoch 20, batch 3750, loss[loss=0.1802, simple_loss=0.2461, pruned_loss=0.05716, over 23889.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2543, pruned_loss=0.05153, over 4723735.68 frames. ], batch size: 195, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:30:21,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:30:21,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:30:24,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 11:30:25,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:30:27,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:30:29,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 11:30:29,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:30:31,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:33,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:34,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:30:39,456 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.76 vs. limit=10.0 2023-09-30 11:30:40,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:43,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:30:43,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:30:46,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:49,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:30:51,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 11:30:51,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:30:52,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:30:54,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:57,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 11:31:02,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 11:31:02,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:31:04,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:31:04,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:07,253 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=698000.0, ans=0.1 2023-09-30 11:31:08,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=698000.0, ans=0.125 2023-09-30 11:31:11,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:13,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 11:31:16,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 11:31:19,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:23,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:31:23,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:31:26,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:31:30,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:31:31,716 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:31:31,875 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=698133.3333333334, ans=0.0 2023-09-30 11:31:33,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:31:34,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:31:36,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:31:38,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:31:44,038 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=698200.0, ans=0.0 2023-09-30 11:31:45,026 INFO [train.py:1039] (1/4) Epoch 20, batch 3800, loss[loss=0.181, simple_loss=0.2474, pruned_loss=0.05725, over 23800.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2544, pruned_loss=0.0516, over 4706618.47 frames. ], batch size: 212, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:31:45,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=698200.0, ans=0.125 2023-09-30 11:31:47,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:31:50,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:52,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:31:53,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 11:31:54,379 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=15.0 2023-09-30 11:31:56,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:57,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:31:59,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:32:00,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 11:32:00,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:01,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:32:03,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:32:03,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:32:04,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:06,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 11:32:09,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 11:32:09,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:32:13,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:15,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:32:15,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:32:18,370 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.841e+02 1.992e+02 2.236e+02 3.615e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 11:32:18,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:32:18,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:20,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:20,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:24,440 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=698333.3333333334, ans=0.1 2023-09-30 11:32:27,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:32:27,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 11:32:28,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:36,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:32:41,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:32:45,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 11:32:48,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 11:32:48,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:50,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=698466.6666666666, ans=0.125 2023-09-30 11:32:51,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:53,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:55,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 11:32:58,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 11:32:58,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 11:32:59,390 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.11 vs. limit=15.0 2023-09-30 11:33:00,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:00,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:33:06,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:33:06,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=698533.3333333334, ans=0.1 2023-09-30 11:33:07,647 INFO [train.py:1039] (1/4) Epoch 20, batch 3850, loss[loss=0.1691, simple_loss=0.2251, pruned_loss=0.05649, over 22667.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2532, pruned_loss=0.05147, over 4704240.84 frames. ], batch size: 322, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:33:07,799 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:33:12,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:33:14,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 11:33:16,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:33:17,563 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:21,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:33:23,600 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:26,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:33:28,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 11:33:31,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=698600.0, ans=0.125 2023-09-30 11:33:34,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:36,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:39,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:39,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:33:41,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:41,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:33:43,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:43,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:33:43,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:44,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:46,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:46,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:33:48,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 11:33:48,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 11:33:49,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:49,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:53,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:33:55,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:55,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 11:33:58,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 11:34:00,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:02,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 11:34:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:34:08,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:10,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:34:13,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:13,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 11:34:16,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 11:34:18,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:18,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:21,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:34:21,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:34:23,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:34:23,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 11:34:23,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:34:27,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 11:34:27,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:27,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:28,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:34:29,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:30,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:34:32,489 INFO [train.py:1039] (1/4) Epoch 20, batch 3900, loss[loss=0.175, simple_loss=0.2417, pruned_loss=0.05412, over 23420.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2516, pruned_loss=0.05141, over 4692203.00 frames. ], batch size: 119, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:34:32,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:32,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:32,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:32,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 11:34:32,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:37,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:38,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:38,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:34:40,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:43,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:43,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:45,474 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.79 vs. limit=15.0 2023-09-30 11:34:46,207 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:34:46,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 11:34:47,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:34:49,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 11:34:49,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:51,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 11:34:52,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 11:34:58,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:34:59,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:59,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:35:01,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:03,684 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=699000.0, ans=0.125 2023-09-30 11:35:04,754 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.817e+02 2.055e+02 2.286e+02 3.490e+02, threshold=4.109e+02, percent-clipped=0.0 2023-09-30 11:35:06,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:35:08,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:35:10,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:35:10,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:11,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:35:17,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:17,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:35:24,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:35:25,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:35:26,588 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.92 vs. limit=15.0 2023-09-30 11:35:27,950 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.72 vs. limit=15.0 2023-09-30 11:35:37,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:35:41,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:43,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 11:35:43,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 11:35:43,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:45,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 11:35:46,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:47,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 11:35:49,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=699133.3333333334, ans=0.125 2023-09-30 11:35:53,958 INFO [train.py:1039] (1/4) Epoch 20, batch 3950, loss[loss=0.1757, simple_loss=0.2691, pruned_loss=0.04116, over 24329.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2516, pruned_loss=0.05092, over 4702163.76 frames. ], batch size: 74, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:35:57,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:57,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 11:35:58,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:35:59,530 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.07 vs. limit=15.0 2023-09-30 11:36:03,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:36:04,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:36:13,267 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 11:36:14,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:14,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 11:36:14,885 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 11:36:16,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:18,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:18,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:36:18,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:22,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 11:36:24,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=699266.6666666666, ans=0.125 2023-09-30 11:36:25,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:36:25,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:25,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:36:25,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:36:26,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:36:36,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:36:36,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:36:41,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 11:36:48,210 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 11:36:48,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 11:36:48,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:36:50,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:36:50,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=699400.0, ans=0.0 2023-09-30 11:36:56,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=699400.0, ans=0.0 2023-09-30 11:36:58,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:36:58,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:36:59,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:59,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:36:59,756 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=699466.6666666666, ans=0.125 2023-09-30 11:37:00,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 11:37:04,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:37:05,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:37:08,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 11:37:12,125 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=699466.6666666666, ans=0.125 2023-09-30 11:37:15,286 INFO [train.py:1039] (1/4) Epoch 20, batch 4000, loss[loss=0.1687, simple_loss=0.2402, pruned_loss=0.04859, over 24348.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2519, pruned_loss=0.05113, over 4710584.98 frames. ], batch size: 56, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:37:17,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699533.3333333334, ans=0.1 2023-09-30 11:37:20,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:24,647 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=699533.3333333334, ans=0.125 2023-09-30 11:37:24,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=699533.3333333334, ans=0.125 2023-09-30 11:37:27,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:31,028 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=699600.0, ans=0.125 2023-09-30 11:37:33,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:34,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:37:35,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:35,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 11:37:35,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:37:35,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 11:37:35,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:37:35,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 11:37:38,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:43,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:37:43,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:37:43,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:37:43,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:37:43,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:37:44,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:37:44,997 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 11:37:46,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:37:46,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:37:48,531 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.791e+02 1.984e+02 2.297e+02 3.398e+02, threshold=3.968e+02, percent-clipped=0.0 2023-09-30 11:37:50,286 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 11:37:50,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:37:50,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:37:59,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 11:37:59,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:38:00,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:38:02,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 11:38:03,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:38:04,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 11:38:04,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:05,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:05,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:38:07,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:38:08,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:38:08,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:38:10,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 11:38:11,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:13,195 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 11:38:19,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:38:22,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:38:25,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:38:26,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:27,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:38:28,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:38:34,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:36,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:38:36,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 11:38:37,995 INFO [train.py:1039] (1/4) Epoch 20, batch 4050, loss[loss=0.2249, simple_loss=0.2818, pruned_loss=0.08404, over 19446.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2529, pruned_loss=0.05121, over 4708183.64 frames. ], batch size: 388, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:38:39,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:38:39,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:38:41,142 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:38:41,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:38:42,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:47,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:50,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:38:52,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:38:52,396 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=699933.3333333334, ans=0.125 2023-09-30 11:38:55,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:38:55,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:59,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:02,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:39:04,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 11:39:07,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 11:39:07,430 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 11:39:09,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:39:14,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 11:39:15,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:17,621 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=700000.0, ans=0.0 2023-09-30 11:39:20,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:21,021 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.95 vs. limit=22.5 2023-09-30 11:39:21,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:23,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:39:23,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:26,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:39:28,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 11:39:28,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:39:31,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:33,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 11:39:39,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:46,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 11:39:47,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:47,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:39:49,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 11:39:49,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 11:39:49,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:39:52,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:39:53,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:39:53,748 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:40:00,015 INFO [train.py:1039] (1/4) Epoch 20, batch 4100, loss[loss=0.2477, simple_loss=0.3026, pruned_loss=0.09642, over 19831.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2548, pruned_loss=0.05223, over 4693741.01 frames. ], batch size: 388, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:40:01,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 11:40:03,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 11:40:05,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 11:40:07,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 11:40:07,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:09,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,346 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:40:10,789 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 11:40:14,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:14,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:40:14,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:16,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:40:21,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:40:22,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:22,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:40:22,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 11:40:24,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:24,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:40:25,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:25,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:40:25,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 11:40:28,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:30,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 11:40:31,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:40:34,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:34,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 11:40:36,124 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.811e+02 2.046e+02 2.303e+02 3.809e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-30 11:40:36,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:40:37,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:40:37,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:40:42,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 11:40:43,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:40:43,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:40:46,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 11:40:48,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:48,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:40:51,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:58,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:01,845 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=700400.0, ans=0.05 2023-09-30 11:41:02,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:03,324 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=700400.0, ans=0.0 2023-09-30 11:41:04,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:41:13,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:13,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:41:18,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:20,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:41:22,555 INFO [train.py:1039] (1/4) Epoch 20, batch 4150, loss[loss=0.163, simple_loss=0.2375, pruned_loss=0.04426, over 24449.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2538, pruned_loss=0.05193, over 4702084.41 frames. ], batch size: 58, lr: 5.11e-03, grad_scale: 4.0 2023-09-30 11:41:25,664 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:41:26,017 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=700533.3333333334, ans=0.125 2023-09-30 11:41:27,660 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:41:27,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:41:27,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:30,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 11:41:30,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:32,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 11:41:32,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 11:41:32,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 11:41:32,687 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=700533.3333333334, ans=0.2 2023-09-30 11:41:34,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:40,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:41:40,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:44,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:41:45,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:41:46,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:41:48,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:41:48,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:50,084 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:41:55,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:59,914 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:01,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 11:42:02,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 11:42:03,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:42:04,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 11:42:04,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:42:04,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:09,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:09,474 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=700666.6666666666, ans=0.2 2023-09-30 11:42:11,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:14,426 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=700733.3333333334, ans=0.2 2023-09-30 11:42:15,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 11:42:17,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=700733.3333333334, ans=0.125 2023-09-30 11:42:18,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:20,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:42:20,470 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 11:42:20,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:23,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 11:42:25,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:42:26,229 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.13 vs. limit=15.0 2023-09-30 11:42:26,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:26,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:26,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=700800.0, ans=0.125 2023-09-30 11:42:28,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 11:42:28,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:42:29,578 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:42:31,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:42:34,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 11:42:34,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:34,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:42:34,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:42:36,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 11:42:36,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:36,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:42:36,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:42:39,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:39,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 11:42:41,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:41,955 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.whiten.whitening_limit, batch_count=700800.0, ans=12.0 2023-09-30 11:42:44,278 INFO [train.py:1039] (1/4) Epoch 20, batch 4200, loss[loss=0.1721, simple_loss=0.2206, pruned_loss=0.06183, over 19497.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2521, pruned_loss=0.05155, over 4699885.31 frames. ], batch size: 388, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:42:46,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:42:46,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 11:42:49,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:42:51,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:42:54,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:42:54,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:54,571 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:57,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 11:42:59,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 11:43:00,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:02,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:05,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=700933.3333333334, ans=0.0 2023-09-30 11:43:06,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:43:09,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:43:11,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:12,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:12,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 11:43:12,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:12,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:14,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:43:14,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:43:16,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:43:17,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 11:43:19,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:20,937 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.891e+02 2.083e+02 2.448e+02 3.727e+02, threshold=4.165e+02, percent-clipped=0.0 2023-09-30 11:43:25,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:43:27,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:43:28,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:43:30,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:43:32,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:43:32,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 11:43:32,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:33,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:43:35,256 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=701066.6666666666, ans=0.125 2023-09-30 11:43:40,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:43:40,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:43,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=701066.6666666666, ans=0.0 2023-09-30 11:43:46,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:43:49,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 11:43:51,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:51,528 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=701133.3333333334, ans=0.0 2023-09-30 11:43:56,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:43:58,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:43:59,814 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=701133.3333333334, ans=0.1 2023-09-30 11:44:00,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 11:44:05,343 INFO [train.py:1039] (1/4) Epoch 20, batch 4250, loss[loss=0.1799, simple_loss=0.2649, pruned_loss=0.04745, over 24643.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2513, pruned_loss=0.05128, over 4704933.40 frames. ], batch size: 73, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:44:06,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:44:09,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=701200.0, ans=0.125 2023-09-30 11:44:11,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:44:11,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:44:13,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:17,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:44:19,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 11:44:19,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:44:23,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:25,867 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-09-30 11:44:27,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:29,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:30,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:32,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:44:32,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:44:33,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=701266.6666666666, ans=0.1 2023-09-30 11:44:35,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:36,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:37,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:40,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:44:41,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:44:43,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 11:44:48,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 11:44:48,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:50,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:50,078 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:51,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:44:51,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:51,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:53,610 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=701400.0, ans=0.125 2023-09-30 11:44:54,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:44:54,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:44:59,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:01,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:02,160 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-30 11:45:03,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 11:45:03,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:45:03,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 11:45:04,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:45:06,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:45:07,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:07,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:45:09,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 11:45:09,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=701466.6666666666, ans=0.0 2023-09-30 11:45:11,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:45:12,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:45:13,152 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=701466.6666666666, ans=0.125 2023-09-30 11:45:18,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:21,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:23,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:45:23,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:25,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:26,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:45:27,994 INFO [train.py:1039] (1/4) Epoch 20, batch 4300, loss[loss=0.1644, simple_loss=0.2413, pruned_loss=0.04379, over 24420.00 frames. ], tot_loss[loss=0.176, simple_loss=0.251, pruned_loss=0.05047, over 4715527.23 frames. ], batch size: 58, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:45:28,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:45:28,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 11:45:29,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:34,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:34,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:45:39,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:44,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:44,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 11:45:44,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:45:48,522 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:45:48,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:45:48,595 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 11:45:51,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:45:54,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:45:54,997 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=701600.0, ans=0.125 2023-09-30 11:45:57,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 11:45:57,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:45:57,819 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 11:46:00,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:46:03,845 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.784e+02 1.943e+02 2.161e+02 2.799e+02, threshold=3.885e+02, percent-clipped=0.0 2023-09-30 11:46:03,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:46:05,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:46:05,727 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:46:07,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:46:08,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:10,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:46:10,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 11:46:12,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 11:46:15,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:46:19,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:19,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:46:20,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:20,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:20,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 11:46:20,488 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 11:46:20,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 11:46:22,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:46:22,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 11:46:24,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 11:46:27,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:29,448 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 11:46:30,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:46:31,662 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.78 vs. limit=10.0 2023-09-30 11:46:32,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:32,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:35,454 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 11:46:35,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:46:35,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:37,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:46:37,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:37,283 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:46:40,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:46:43,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:44,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:44,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:49,815 INFO [train.py:1039] (1/4) Epoch 20, batch 4350, loss[loss=0.1878, simple_loss=0.2579, pruned_loss=0.05887, over 22752.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.252, pruned_loss=0.05049, over 4720259.19 frames. ], batch size: 322, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:46:51,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 11:46:51,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:46:53,314 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=701866.6666666666, ans=0.125 2023-09-30 11:46:57,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:00,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:04,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:47:04,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:47:08,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:47:09,101 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=701933.3333333334, ans=0.0 2023-09-30 11:47:13,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:14,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:47:15,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:19,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:47:21,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:47:22,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:47:30,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 11:47:30,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:31,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:36,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:39,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 11:47:42,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:44,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:47:47,356 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 11:47:48,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:48,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:47:50,432 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 11:47:51,154 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.90 vs. limit=22.5 2023-09-30 11:47:51,856 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 11:47:51,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:51,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:53,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:47:53,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=702066.6666666666, ans=0.1 2023-09-30 11:47:54,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:56,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:56,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:58,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 11:47:58,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:58,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:59,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:00,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 11:48:01,929 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 11:48:01,936 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 11:48:01,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 11:48:07,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:48:09,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:48:09,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:09,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:48:10,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 11:48:13,807 INFO [train.py:1039] (1/4) Epoch 20, batch 4400, loss[loss=0.1602, simple_loss=0.2383, pruned_loss=0.041, over 24464.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2529, pruned_loss=0.05067, over 4719371.20 frames. ], batch size: 58, lr: 5.10e-03, grad_scale: 16.0 2023-09-30 11:48:13,911 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 11:48:13,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:18,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:18,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:20,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:48:23,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 11:48:23,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 11:48:23,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 11:48:23,298 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 11:48:24,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:48:24,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:27,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 11:48:29,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:31,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:31,052 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 11:48:34,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:34,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 11:48:35,039 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 11:48:36,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 11:48:38,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 11:48:38,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 11:48:38,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:40,995 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:43,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:43,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:48:45,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 11:48:45,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 11:48:47,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:49,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:48:49,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:49,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:49,478 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=702333.3333333334, ans=0.0 2023-09-30 11:48:50,489 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.844e+02 2.019e+02 2.293e+02 3.220e+02, threshold=4.037e+02, percent-clipped=0.0 2023-09-30 11:48:50,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:50,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 11:48:52,219 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 11:48:55,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:01,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:49:04,603 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 11:49:09,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:49:13,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:14,776 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:49:14,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 11:49:16,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:49:16,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:49:16,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:49:18,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:49:23,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 11:49:26,693 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=702466.6666666666, ans=0.125 2023-09-30 11:49:27,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 11:49:29,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 11:49:29,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:49:29,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 11:49:30,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:49:33,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:49:34,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=702533.3333333334, ans=0.1 2023-09-30 11:49:35,355 INFO [train.py:1039] (1/4) Epoch 20, batch 4450, loss[loss=0.1683, simple_loss=0.2568, pruned_loss=0.03992, over 24439.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2534, pruned_loss=0.05074, over 4731960.63 frames. ], batch size: 69, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:49:35,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 11:49:35,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=702533.3333333334, ans=0.125 2023-09-30 11:49:40,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:43,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:43,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:49:50,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:49:50,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:49:50,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=702600.0, ans=0.125 2023-09-30 11:49:54,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:57,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:50:01,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:50:01,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:01,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 11:50:01,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:03,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:03,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:03,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:50:06,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:50:10,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:10,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:11,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:11,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:13,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:50:17,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:50:17,203 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=702666.6666666666, ans=0.125 2023-09-30 11:50:18,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 11:50:18,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 11:50:18,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:50:18,839 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=702666.6666666666, ans=0.1 2023-09-30 11:50:22,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:22,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 11:50:29,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:50:31,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=702733.3333333334, ans=0.0 2023-09-30 11:50:32,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:32,718 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=702733.3333333334, ans=0.125 2023-09-30 11:50:33,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 11:50:33,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:33,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:33,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:50:33,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:36,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:41,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:50:41,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 11:50:41,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=702800.0, ans=0.2 2023-09-30 11:50:43,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:50:44,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:46,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:47,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:47,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:50:50,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:50:51,173 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=702800.0, ans=0.1 2023-09-30 11:50:54,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 11:50:55,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:50:57,881 INFO [train.py:1039] (1/4) Epoch 20, batch 4500, loss[loss=0.1592, simple_loss=0.2312, pruned_loss=0.04357, over 24324.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2546, pruned_loss=0.05158, over 4721594.67 frames. ], batch size: 56, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:51:03,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:04,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 11:51:04,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 11:51:05,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:10,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:51:10,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:11,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:51:11,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:51:11,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:13,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:25,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:25,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:51:31,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:31,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:51:33,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:51:36,925 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.958e+02 2.176e+02 2.653e+02 3.969e+02, threshold=4.352e+02, percent-clipped=0.0 2023-09-30 11:51:38,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:51:42,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:51:45,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:51:47,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=703066.6666666666, ans=0.0 2023-09-30 11:51:48,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:51:48,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 11:51:48,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:51:49,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:50,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:51,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:55,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:55,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 11:51:55,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:51:55,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:00,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:52:00,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:52:06,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:06,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:52:08,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:52:09,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 11:52:12,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 11:52:12,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 11:52:16,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 11:52:16,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 11:52:17,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:20,732 INFO [train.py:1039] (1/4) Epoch 20, batch 4550, loss[loss=0.1653, simple_loss=0.2392, pruned_loss=0.0457, over 24256.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2529, pruned_loss=0.05178, over 4707113.45 frames. ], batch size: 56, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:52:22,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:22,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:24,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:28,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:52:30,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:52:34,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:52:34,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:52:34,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:36,769 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=703266.6666666666, ans=0.07 2023-09-30 11:52:37,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:38,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:41,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:52:44,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 11:52:45,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 11:52:45,884 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:52:47,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:52:48,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 11:52:51,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 11:52:53,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:52:56,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 11:52:56,848 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=703333.3333333334, ans=0.1 2023-09-30 11:52:58,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:53:01,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=703333.3333333334, ans=0.125 2023-09-30 11:53:02,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:53:05,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 11:53:05,883 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=703333.3333333334, ans=0.0 2023-09-30 11:53:09,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:11,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:11,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:53:13,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:13,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=703400.0, ans=0.0 2023-09-30 11:53:14,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 11:53:14,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 11:53:14,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:53:16,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 11:53:19,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 11:53:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:20,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:20,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:23,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:23,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:53:25,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:53:25,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 11:53:26,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:26,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:53:26,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 11:53:28,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:53:28,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 11:53:28,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=703466.6666666666, ans=0.0 2023-09-30 11:53:31,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:53:31,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:53:35,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:53:35,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:35,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:53:38,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:53:38,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:53:40,326 INFO [train.py:1039] (1/4) Epoch 20, batch 4600, loss[loss=0.1817, simple_loss=0.2672, pruned_loss=0.04812, over 24648.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2529, pruned_loss=0.05156, over 4721887.40 frames. ], batch size: 73, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:53:40,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:40,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=703533.3333333334, ans=0.0 2023-09-30 11:53:42,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:42,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=703533.3333333334, ans=0.035 2023-09-30 11:53:46,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:53:46,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:53:47,364 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=703533.3333333334, ans=0.5 2023-09-30 11:53:48,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:49,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 11:53:51,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:53:51,734 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=703533.3333333334, ans=0.125 2023-09-30 11:53:56,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:53:56,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:56,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:53:57,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:04,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 11:54:05,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:08,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:11,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:54:11,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:12,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=703666.6666666666, ans=0.0 2023-09-30 11:54:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 11:54:18,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:54:19,794 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.811e+02 2.166e+02 2.771e+02 4.574e+02, threshold=4.333e+02, percent-clipped=1.0 2023-09-30 11:54:19,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:54:26,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=703666.6666666666, ans=0.125 2023-09-30 11:54:27,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:27,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:54:29,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:54:33,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 11:54:35,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:54:39,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:41,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:54:42,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:42,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 11:54:42,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:44,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 11:54:44,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:45,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:47,312 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:47,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:48,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:49,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 11:54:51,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 11:54:51,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 11:54:51,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:53,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:54:53,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:55,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:55,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=703800.0, ans=0.0 2023-09-30 11:55:02,992 INFO [train.py:1039] (1/4) Epoch 20, batch 4650, loss[loss=0.1499, simple_loss=0.2235, pruned_loss=0.03817, over 24460.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2525, pruned_loss=0.0512, over 4715596.00 frames. ], batch size: 58, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:55:06,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:55:09,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:09,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:09,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:55:09,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:55:09,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:12,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:15,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 11:55:18,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:55:20,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 11:55:21,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:21,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 11:55:23,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:55:23,349 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 11:55:23,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 11:55:25,416 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:25,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=703933.3333333334, ans=0.125 2023-09-30 11:55:26,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:55:27,674 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=12.0 2023-09-30 11:55:28,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:55:30,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:30,264 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 11:55:34,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:35,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 11:55:39,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:39,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:55:39,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 11:55:40,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:55:43,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:55:46,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:52,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:54,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:56,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:56,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:56:01,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 11:56:01,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 11:56:01,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 11:56:01,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 11:56:03,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:10,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:56:10,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:10,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 11:56:10,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:12,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:12,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:56:14,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:56:17,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:56:17,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:19,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:56:22,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:22,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:56:22,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:56:22,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 11:56:23,857 INFO [train.py:1039] (1/4) Epoch 20, batch 4700, loss[loss=0.1906, simple_loss=0.2705, pruned_loss=0.05532, over 24030.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2528, pruned_loss=0.05075, over 4720664.55 frames. ], batch size: 80, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:56:24,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:56:25,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=704200.0, ans=0.125 2023-09-30 11:56:26,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 11:56:27,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=704200.0, ans=0.125 2023-09-30 11:56:36,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:36,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:37,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:56:39,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:41,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:56:43,274 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-09-30 11:56:44,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 11:56:44,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 11:56:48,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:48,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:56:50,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:56:51,986 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=704266.6666666666, ans=0.125 2023-09-30 11:56:54,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:00,543 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.959e+02 2.502e+02 2.880e+02 4.077e+02, threshold=5.005e+02, percent-clipped=0.0 2023-09-30 11:57:00,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:57:02,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:57:04,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=704333.3333333334, ans=0.125 2023-09-30 11:57:05,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:57:11,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 11:57:12,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:57:16,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:19,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 11:57:21,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:57:24,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:57:24,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 11:57:27,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:27,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:30,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:30,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:57:30,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 11:57:31,902 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 11:57:33,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:35,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:35,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:36,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 11:57:36,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:42,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 11:57:45,965 INFO [train.py:1039] (1/4) Epoch 20, batch 4750, loss[loss=0.1753, simple_loss=0.2596, pruned_loss=0.04544, over 24627.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2531, pruned_loss=0.05093, over 4720703.35 frames. ], batch size: 68, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:57:46,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:57:46,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:51,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:51,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:57:55,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 11:57:55,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:00,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 11:58:01,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:58:01,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:03,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:09,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 11:58:14,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:58:16,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 11:58:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:18,449 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=704666.6666666666, ans=0.125 2023-09-30 11:58:19,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:21,483 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 11:58:21,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 11:58:22,039 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=704666.6666666666, ans=0.1 2023-09-30 11:58:24,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 11:58:28,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:28,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:31,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:58:31,608 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 11:58:32,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:34,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:58:34,988 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=704733.3333333334, ans=0.025 2023-09-30 11:58:36,613 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=704733.3333333334, ans=0.125 2023-09-30 11:58:37,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:58:39,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 11:58:39,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 11:58:39,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:40,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:58:40,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:42,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:58:43,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 11:58:48,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 11:58:50,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:58:53,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:53,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 11:58:53,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:56,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:58,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:58:58,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:59,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:59:00,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=704800.0, ans=0.1 2023-09-30 11:59:03,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:03,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 11:59:03,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 11:59:05,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 11:59:08,058 INFO [train.py:1039] (1/4) Epoch 20, batch 4800, loss[loss=0.1816, simple_loss=0.2688, pruned_loss=0.04721, over 24417.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2534, pruned_loss=0.05104, over 4714034.50 frames. ], batch size: 69, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 11:59:08,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:59:09,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:09,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 11:59:11,670 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=704866.6666666666, ans=0.5 2023-09-30 11:59:12,172 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.61 vs. limit=15.0 2023-09-30 11:59:15,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:17,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:23,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:59:23,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:24,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:26,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 11:59:26,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:59:26,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:59:28,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:59:33,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:59:34,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:34,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:59:36,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:36,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:59:36,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:38,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:41,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:42,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:59:45,735 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.894e+02 2.052e+02 2.398e+02 3.146e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 11:59:45,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:59:46,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:48,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 11:59:48,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 11:59:49,208 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=705000.0, ans=0.0 2023-09-30 11:59:50,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:50,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:59:51,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:59:51,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:51,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:59:53,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:59:55,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:59,610 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:04,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:05,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:07,898 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.16 vs. limit=12.0 2023-09-30 12:00:12,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 12:00:12,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:12,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:12,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:00:14,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:16,077 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=705133.3333333334, ans=10.0 2023-09-30 12:00:18,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:20,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:00:20,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:21,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:00:21,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:00:23,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:00:26,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:26,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:26,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:27,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 12:00:29,405 INFO [train.py:1039] (1/4) Epoch 20, batch 4850, loss[loss=0.1758, simple_loss=0.2521, pruned_loss=0.04969, over 23208.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2541, pruned_loss=0.05086, over 4724191.16 frames. ], batch size: 105, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:00:29,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 12:00:29,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:29,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:33,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:00:33,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:36,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:36,383 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=705200.0, ans=0.125 2023-09-30 12:00:45,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 12:00:47,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:50,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=705266.6666666666, ans=0.125 2023-09-30 12:00:51,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:00:52,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:00:52,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:55,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:57,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:00:59,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:00:59,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 12:01:01,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:01:06,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:01:06,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:01:06,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:01:06,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 12:01:10,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:01:10,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:14,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:14,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 12:01:15,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 12:01:15,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=705333.3333333334, ans=0.125 2023-09-30 12:01:16,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:01:23,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:01:23,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 12:01:24,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:01:25,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:01:26,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:01:26,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 12:01:26,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:28,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 12:01:28,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:29,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:31,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 12:01:39,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:45,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:01:45,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:01:49,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 12:01:49,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:01:53,023 INFO [train.py:1039] (1/4) Epoch 20, batch 4900, loss[loss=0.1685, simple_loss=0.2565, pruned_loss=0.04022, over 24462.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2526, pruned_loss=0.051, over 4706325.86 frames. ], batch size: 69, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:01:56,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:58,471 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.21 vs. limit=22.5 2023-09-30 12:01:58,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:58,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:02:02,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 12:02:05,397 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=705533.3333333334, ans=0.125 2023-09-30 12:02:08,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 12:02:12,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 12:02:13,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 12:02:13,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:13,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:02:14,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=705600.0, ans=0.125 2023-09-30 12:02:15,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:02:15,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:15,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:02:15,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 12:02:19,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=705600.0, ans=0.1 2023-09-30 12:02:20,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 12:02:20,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:02:23,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:02:24,068 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=705666.6666666666, ans=0.0 2023-09-30 12:02:25,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:28,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:02:28,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:29,516 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.910e+02 2.109e+02 2.496e+02 4.455e+02, threshold=4.218e+02, percent-clipped=1.0 2023-09-30 12:02:31,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:31,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 12:02:33,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:02:33,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:33,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 12:02:33,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 12:02:39,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 12:02:41,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:02:42,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:02:42,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:02:44,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:44,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:02:44,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:02:45,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 12:02:49,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:51,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:02:52,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:02:56,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 12:02:56,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:02:56,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:02:56,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 12:03:03,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:05,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:06,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 12:03:06,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:06,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:03:09,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:14,406 INFO [train.py:1039] (1/4) Epoch 20, batch 4950, loss[loss=0.1936, simple_loss=0.2585, pruned_loss=0.0643, over 23754.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2518, pruned_loss=0.05078, over 4696871.16 frames. ], batch size: 164, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:03:14,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:14,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:03:15,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:15,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 12:03:17,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:03:21,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:21,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:22,115 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.23 vs. limit=12.0 2023-09-30 12:03:24,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 12:03:24,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 12:03:24,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:03:24,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 12:03:26,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:26,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:03:26,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:03:26,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:29,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:29,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:03:31,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:03:32,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:34,002 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.32 vs. limit=22.5 2023-09-30 12:03:34,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:34,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:38,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:03:43,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:46,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:47,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:49,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:50,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:03:53,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 12:03:53,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 12:03:55,198 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=706000.0, ans=0.0 2023-09-30 12:03:56,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:58,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:03:58,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:03:59,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:03:59,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:04:01,177 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:04:04,138 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.85 vs. limit=15.0 2023-09-30 12:04:04,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:06,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:04:09,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:04:10,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:10,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:12,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 12:04:12,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:04:12,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:04:17,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:04:18,197 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=706066.6666666666, ans=0.125 2023-09-30 12:04:19,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:04:19,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:04:20,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:20,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:04:22,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:04:23,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:04:23,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:04:23,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:26,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 12:04:33,142 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.56 vs. limit=15.0 2023-09-30 12:04:34,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:37,585 INFO [train.py:1039] (1/4) Epoch 20, batch 5000, loss[loss=0.1881, simple_loss=0.271, pruned_loss=0.05265, over 24536.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2519, pruned_loss=0.05047, over 4714959.50 frames. ], batch size: 71, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:04:39,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 12:04:39,323 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:04:44,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=706200.0, ans=0.1 2023-09-30 12:04:47,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:47,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:04:47,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 12:04:48,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 12:04:50,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:04:52,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 12:04:54,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:04:54,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:04:54,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 12:04:54,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=706266.6666666666, ans=0.1 2023-09-30 12:04:55,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:55,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:04:57,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 12:04:57,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:57,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:04:58,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 12:05:00,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 12:05:02,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:05:02,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 12:05:02,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:05:02,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:03,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:05:03,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 12:05:03,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 12:05:04,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 12:05:04,797 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=706266.6666666666, ans=0.025 2023-09-30 12:05:04,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=706266.6666666666, ans=0.125 2023-09-30 12:05:05,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:06,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:07,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 12:05:09,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:05:11,194 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:11,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:05:12,337 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.11 vs. limit=10.0 2023-09-30 12:05:12,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:05:14,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 12:05:14,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:05:15,915 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.770e+02 2.000e+02 2.327e+02 3.746e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 12:05:16,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:05:19,371 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 12:05:22,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:05:23,289 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=706333.3333333334, ans=0.1 2023-09-30 12:05:24,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:24,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:29,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 12:05:29,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:29,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:30,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:05:32,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 12:05:32,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:35,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:37,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:05:43,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 12:05:47,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:54,408 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=706466.6666666666, ans=0.125 2023-09-30 12:05:57,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:59,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:59,725 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=706533.3333333334, ans=0.2 2023-09-30 12:06:00,968 INFO [train.py:1039] (1/4) Epoch 20, batch 5050, loss[loss=0.1571, simple_loss=0.236, pruned_loss=0.03908, over 24441.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2521, pruned_loss=0.05031, over 4729053.98 frames. ], batch size: 58, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:06:01,045 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:06:01,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:01,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:06:01,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:06:01,279 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:03,124 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=706533.3333333334, ans=0.0 2023-09-30 12:06:06,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 12:06:07,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:06:09,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:11,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:06:12,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 12:06:15,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:15,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:06:16,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:06:18,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:06:19,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:06:20,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=706600.0, ans=0.0 2023-09-30 12:06:23,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=706600.0, ans=0.1 2023-09-30 12:06:30,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 12:06:30,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:06:31,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:31,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 12:06:34,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:06:37,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:37,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:37,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:06:37,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 12:06:38,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 12:06:39,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:40,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:06:43,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:45,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 12:06:47,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:06:50,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 12:06:51,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:06:53,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:06:53,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:06:53,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:55,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:06:58,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:06:58,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:59,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:59,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:06:59,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 12:06:59,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:07:00,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=706733.3333333334, ans=0.0 2023-09-30 12:07:01,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:07:02,007 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.11 vs. limit=12.0 2023-09-30 12:07:05,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:07:05,971 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 12:07:05,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:07:07,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:09,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:09,555 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 12:07:13,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:13,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 12:07:13,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:17,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:18,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:18,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 12:07:20,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 12:07:24,209 INFO [train.py:1039] (1/4) Epoch 20, batch 5100, loss[loss=0.1796, simple_loss=0.2681, pruned_loss=0.04557, over 24473.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2526, pruned_loss=0.05019, over 4737675.06 frames. ], batch size: 69, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:07:24,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:24,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:24,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:07:27,915 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 12:07:30,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:33,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 12:07:33,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 12:07:35,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:36,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:07:40,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:40,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 12:07:40,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 12:07:46,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:46,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:07:51,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:51,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=706933.3333333334, ans=0.0 2023-09-30 12:07:54,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 12:07:54,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:58,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:58,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 12:07:59,211 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=707000.0, ans=0.125 2023-09-30 12:08:00,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,551 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.819e+02 2.005e+02 2.241e+02 3.147e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 12:08:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 12:08:03,926 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 12:08:05,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:05,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 12:08:05,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 12:08:10,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:08:15,094 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=707066.6666666666, ans=0.05 2023-09-30 12:08:18,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:20,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 12:08:21,548 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 12:08:21,572 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 12:08:23,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 12:08:23,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:26,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 12:08:29,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 12:08:29,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=707133.3333333334, ans=0.125 2023-09-30 12:08:33,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 12:08:35,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:08:36,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 12:08:40,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:08:40,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 12:08:46,131 INFO [train.py:1039] (1/4) Epoch 20, batch 5150, loss[loss=0.1813, simple_loss=0.2501, pruned_loss=0.05631, over 23775.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2536, pruned_loss=0.05103, over 4726577.34 frames. ], batch size: 164, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:08:46,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:08:47,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:08:47,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:08:47,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:08:47,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:08:49,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:08:51,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 12:08:51,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 12:08:51,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 12:08:51,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:08:51,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 12:08:53,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:53,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:08:56,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:08:57,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:08:58,428 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=707200.0, ans=0.0 2023-09-30 12:09:02,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:09:02,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 12:09:04,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:04,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:09:07,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:09:07,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:07,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:07,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:09:07,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:09:08,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=707266.6666666666, ans=0.0 2023-09-30 12:09:10,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 12:09:11,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:09:11,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:15,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:09:17,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 12:09:18,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:09:23,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:09:24,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 12:09:29,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:36,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:38,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:38,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=707400.0, ans=0.0 2023-09-30 12:09:42,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:44,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:46,695 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=707400.0, ans=0.95 2023-09-30 12:09:47,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 12:09:49,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:09:51,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:09:51,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:56,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:56,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:59,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 12:10:02,697 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=707466.6666666666, ans=0.125 2023-09-30 12:10:03,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:10:05,152 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.77 vs. limit=22.5 2023-09-30 12:10:07,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:10:07,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:10:07,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=707533.3333333334, ans=0.05 2023-09-30 12:10:09,493 INFO [train.py:1039] (1/4) Epoch 20, batch 5200, loss[loss=0.1831, simple_loss=0.2634, pruned_loss=0.05141, over 23431.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2539, pruned_loss=0.05112, over 4727887.85 frames. ], batch size: 93, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:10:09,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:10:11,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:10:11,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:10:11,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:10:11,459 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=707533.3333333334, ans=0.125 2023-09-30 12:10:14,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:10:17,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:10:18,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:19,746 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=707533.3333333334, ans=0.1 2023-09-30 12:10:23,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 12:10:24,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:10:25,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:28,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:29,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=707600.0, ans=0.125 2023-09-30 12:10:30,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:10:30,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:31,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 12:10:33,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:10:33,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:36,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 12:10:37,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=707600.0, ans=0.2 2023-09-30 12:10:38,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:10:38,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:10:38,898 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=707600.0, ans=0.125 2023-09-30 12:10:40,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 12:10:40,981 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 12:10:44,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 12:10:46,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:46,085 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 12:10:46,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:47,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:47,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:10:48,962 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.872e+02 2.079e+02 2.395e+02 3.722e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-30 12:10:49,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 12:10:50,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:10:52,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:54,573 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 12:10:55,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 12:10:55,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 12:11:01,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 12:11:01,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:11:05,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:11:06,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:07,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 12:11:09,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:11:09,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:11:09,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:10,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:13,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:13,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:11:18,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:11:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:19,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:25,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:25,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 12:11:26,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:26,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:11:30,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:30,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:11:31,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:11:33,431 INFO [train.py:1039] (1/4) Epoch 20, batch 5250, loss[loss=0.1851, simple_loss=0.2671, pruned_loss=0.05158, over 24429.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.253, pruned_loss=0.05097, over 4709367.14 frames. ], batch size: 77, lr: 5.08e-03, grad_scale: 16.0 2023-09-30 12:11:34,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:11:38,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:38,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:11:39,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:11:47,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:48,502 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.76 vs. limit=15.0 2023-09-30 12:11:49,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:11:49,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=707933.3333333334, ans=0.125 2023-09-30 12:11:50,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:11:51,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=707933.3333333334, ans=0.125 2023-09-30 12:11:52,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:54,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 12:11:54,599 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:56,069 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:12:02,790 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.47 vs. limit=15.0 2023-09-30 12:12:21,206 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=708066.6666666666, ans=0.07 2023-09-30 12:12:23,903 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=708066.6666666666, ans=0.05 2023-09-30 12:12:44,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=708133.3333333334, ans=0.0 2023-09-30 12:12:47,672 INFO [train.py:1039] (1/4) Epoch 20, batch 5300, loss[loss=0.1627, simple_loss=0.2471, pruned_loss=0.03914, over 24458.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2522, pruned_loss=0.0505, over 4721636.14 frames. ], batch size: 66, lr: 5.08e-03, grad_scale: 8.0 2023-09-30 12:13:02,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:13:02,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 12:13:02,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 12:13:02,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:03,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:03,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:03,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:03,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:03,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:03,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:03,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:13:04,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:13:04,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 12:13:04,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 12:13:04,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 12:13:04,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:13:05,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 12:13:05,158 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 12:13:05,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:05,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:05,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:06,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:06,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:13:06,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:06,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:06,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:06,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:06,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:06,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:13:07,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:07,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:13:08,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 12:13:08,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:09,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:09,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 12:13:09,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 12:13:09,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:13:09,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:09,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 12:13:09,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 12:13:09,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:10,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:13:10,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:10,757 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 12:13:10,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 12:13:10,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:13:11,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:11,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 12:13:11,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 12:13:11,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 12:13:11,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:17,100 INFO [train.py:1039] (1/4) Epoch 21, batch 0, loss[loss=0.1658, simple_loss=0.2476, pruned_loss=0.042, over 24630.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2476, pruned_loss=0.042, over 24630.00 frames. ], batch size: 65, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:13:17,101 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 12:13:30,291 INFO [train.py:1071] (1/4) Epoch 21, validation: loss=0.2775, simple_loss=0.2715, pruned_loss=0.1418, over 1125622.00 frames. 2023-09-30 12:13:30,292 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 12:13:34,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 12:13:34,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:13:37,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:13:40,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:42,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:13:42,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:42,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 12:13:43,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 12:13:47,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:47,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:50,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:13:52,057 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.011e+02 2.315e+02 3.678e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:13:52,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:13:53,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 12:13:56,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:14:03,351 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=708413.3333333334, ans=0.125 2023-09-30 12:14:05,050 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:14:05,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:07,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 12:14:12,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:14:12,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:14:15,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:19,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:14:22,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:27,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 12:14:27,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=708480.0, ans=0.125 2023-09-30 12:14:30,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 12:14:30,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:30,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:32,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:14:32,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:33,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 12:14:37,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:37,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:42,723 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:14:44,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=708546.6666666666, ans=0.0 2023-09-30 12:14:45,947 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 12:14:47,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:14:51,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:14:52,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:53,945 INFO [train.py:1039] (1/4) Epoch 21, batch 50, loss[loss=0.1816, simple_loss=0.2655, pruned_loss=0.0488, over 23999.00 frames. ], tot_loss[loss=0.176, simple_loss=0.25, pruned_loss=0.05105, over 1064299.06 frames. ], batch size: 86, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:14:54,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 12:14:54,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:14:54,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:14:57,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:14:57,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:14:57,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=708613.3333333334, ans=0.125 2023-09-30 12:15:00,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:15:04,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 12:15:04,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:05,237 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=708613.3333333334, ans=0.0 2023-09-30 12:15:11,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:15:13,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 12:15:15,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 12:15:17,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:15:17,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=708680.0, ans=0.125 2023-09-30 12:15:18,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:18,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:20,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:15:20,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:15:21,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:15:21,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:27,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=708746.6666666666, ans=0.125 2023-09-30 12:15:33,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:34,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:15:34,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:15:36,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 12:15:36,992 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.65 vs. limit=15.0 2023-09-30 12:15:37,969 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:15:39,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:15:39,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 12:15:40,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:15:42,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 12:15:49,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:15:49,464 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:50,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:51,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:51,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:15:56,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 12:15:56,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 12:15:59,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:59,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:16:01,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:16:01,862 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=708880.0, ans=0.125 2023-09-30 12:16:02,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:16:02,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 12:16:04,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 12:16:04,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:16:05,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:07,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:16:08,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 12:16:08,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 12:16:10,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:11,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:13,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:16:13,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:16:14,845 INFO [train.py:1039] (1/4) Epoch 21, batch 100, loss[loss=0.1708, simple_loss=0.2545, pruned_loss=0.0436, over 24458.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2515, pruned_loss=0.04991, over 1878600.90 frames. ], batch size: 63, lr: 4.96e-03, grad_scale: 8.0 2023-09-30 12:16:16,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:16:18,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:16:18,452 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:16:22,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:24,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 12:16:24,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:16:30,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:16:30,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:30,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:30,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:16:30,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:33,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 12:16:35,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:16:36,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:36,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:36,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:38,921 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.766e+02 1.906e+02 2.268e+02 3.553e+02, threshold=3.812e+02, percent-clipped=0.0 2023-09-30 12:16:40,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 12:16:42,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:42,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:42,471 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:16:45,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:16:48,696 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 12:16:48,721 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 12:16:50,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:16:50,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:16:54,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:16:56,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:58,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:03,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:05,298 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 12:17:06,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:17:08,551 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=709146.6666666666, ans=0.0 2023-09-30 12:17:09,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:10,944 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:17:10,989 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=709146.6666666666, ans=0.2 2023-09-30 12:17:12,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:12,451 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=709146.6666666666, ans=0.125 2023-09-30 12:17:13,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:15,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=709146.6666666666, ans=0.0 2023-09-30 12:17:16,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:18,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:19,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:17:21,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:22,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:24,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:24,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:17:24,231 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:24,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 12:17:24,351 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 12:17:24,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:25,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:17:26,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:26,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:26,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 12:17:26,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:17:27,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:17:27,590 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:29,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:30,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:30,980 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=709213.3333333334, ans=0.0 2023-09-30 12:17:32,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:17:32,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:17:34,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:37,851 INFO [train.py:1039] (1/4) Epoch 21, batch 150, loss[loss=0.155, simple_loss=0.2324, pruned_loss=0.03878, over 23129.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2532, pruned_loss=0.05055, over 2506552.48 frames. ], batch size: 51, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:17:37,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:37,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:17:38,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:41,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:41,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:43,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:44,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:49,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 12:17:49,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 12:17:49,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 12:17:54,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:17:54,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:17:54,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:55,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:55,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:55,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:57,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:00,473 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 12:18:02,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:09,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:09,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=709413.3333333334, ans=0.1 2023-09-30 12:18:09,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=709413.3333333334, ans=0.125 2023-09-30 12:18:12,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:18:14,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 12:18:17,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:18:17,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:17,299 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:20,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:18:21,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:18:22,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:18:22,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=709413.3333333334, ans=0.125 2023-09-30 12:18:25,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:25,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 12:18:31,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:31,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:18:33,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:18:33,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:18:33,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=709480.0, ans=0.125 2023-09-30 12:18:34,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:36,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 12:18:37,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:18:40,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:18:40,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=709480.0, ans=0.125 2023-09-30 12:18:42,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:42,588 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=709546.6666666666, ans=0.125 2023-09-30 12:18:43,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:18:43,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 12:18:45,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:45,807 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 12:18:50,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:53,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:53,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:18:57,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 12:18:57,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:58,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:18:59,025 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=709613.3333333334, ans=0.125 2023-09-30 12:19:00,192 INFO [train.py:1039] (1/4) Epoch 21, batch 200, loss[loss=0.1683, simple_loss=0.2557, pruned_loss=0.04042, over 24573.00 frames. ], tot_loss[loss=0.176, simple_loss=0.253, pruned_loss=0.04944, over 3009588.13 frames. ], batch size: 71, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:19:00,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 12:19:00,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:19:02,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:03,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:03,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=709613.3333333334, ans=0.125 2023-09-30 12:19:03,866 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:19:09,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:19:09,376 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:19:09,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:11,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=709613.3333333334, ans=0.04949747468305833 2023-09-30 12:19:22,965 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.816e+02 2.113e+02 2.431e+02 3.187e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-30 12:19:25,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=709680.0, ans=0.1 2023-09-30 12:19:29,455 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:19:30,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:19:31,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:19:34,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:19:34,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:19:35,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:19:35,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:19:37,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:39,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:19:39,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:19:39,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:19:40,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 12:19:40,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:19:42,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:45,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:19:53,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:20:02,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:02,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:20:10,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:11,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=709880.0, ans=0.025 2023-09-30 12:20:13,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=709880.0, ans=0.125 2023-09-30 12:20:14,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 12:20:14,527 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:15,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:20:15,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:17,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:20:20,270 INFO [train.py:1039] (1/4) Epoch 21, batch 250, loss[loss=0.1781, simple_loss=0.2573, pruned_loss=0.04944, over 24450.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2527, pruned_loss=0.04874, over 3395549.31 frames. ], batch size: 63, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:20:20,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 12:20:20,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:20:20,484 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 12:20:23,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:23,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:20:25,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:26,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:26,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=709946.6666666666, ans=0.2 2023-09-30 12:20:30,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:20:30,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:32,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:20:33,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.93 vs. limit=15.0 2023-09-30 12:20:36,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:20:46,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:20:48,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:49,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:20:56,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:20:57,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:20:57,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:20:57,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:20:59,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:20:59,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:21:00,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:21:02,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:21:05,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 12:21:06,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:21:08,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:21:09,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:21:09,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:21:09,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:12,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:21:12,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:21:14,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:14,586 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=710146.6666666666, ans=0.2 2023-09-30 12:21:15,667 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:21:17,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:21,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:21:26,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:29,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:21:36,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:38,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:21:40,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 12:21:41,370 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.86 vs. limit=22.5 2023-09-30 12:21:42,806 INFO [train.py:1039] (1/4) Epoch 21, batch 300, loss[loss=0.1745, simple_loss=0.2547, pruned_loss=0.04713, over 24340.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2516, pruned_loss=0.04881, over 3679561.75 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:21:43,070 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:21:43,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:44,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 12:21:44,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:21:46,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:21:46,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 12:21:52,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:53,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:21:56,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:21:58,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 12:21:59,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:22:00,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:22:00,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 12:22:00,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:05,047 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.836e+02 2.048e+02 2.213e+02 3.686e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:22:05,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:22:08,379 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:22:08,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 12:22:14,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 12:22:14,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:17,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:18,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:18,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 12:22:18,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:22:21,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:22:24,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:22:24,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:26,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=710413.3333333334, ans=0.125 2023-09-30 12:22:29,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:22:29,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 12:22:31,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:22:33,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:34,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 12:22:34,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:36,710 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=710480.0, ans=0.125 2023-09-30 12:22:38,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:22:41,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:22:41,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 12:22:46,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:46,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:22:50,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:52,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:22:53,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 12:22:53,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:22:54,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:55,336 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=710546.6666666666, ans=10.0 2023-09-30 12:22:56,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 12:22:56,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:56,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:22:58,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:58,466 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=710546.6666666666, ans=0.125 2023-09-30 12:22:59,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:59,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:04,271 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.83 vs. limit=15.0 2023-09-30 12:23:04,735 INFO [train.py:1039] (1/4) Epoch 21, batch 350, loss[loss=0.1657, simple_loss=0.2348, pruned_loss=0.04829, over 23845.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2488, pruned_loss=0.04887, over 3901723.33 frames. ], batch size: 195, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:23:04,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:04,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:23:09,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:11,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=710613.3333333334, ans=0.125 2023-09-30 12:23:14,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:23:17,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:17,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:21,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 12:23:21,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:23,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 12:23:26,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:26,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 12:23:26,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:31,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 12:23:32,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:23:32,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:34,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:23:37,415 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.32 vs. limit=15.0 2023-09-30 12:23:37,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:37,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:38,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:23:38,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:38,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:23:41,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:23:41,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:47,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:23:47,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:23:48,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:23:50,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:56,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 12:23:56,302 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:24:01,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:01,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:01,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:24:03,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 12:24:03,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=710813.3333333334, ans=0.125 2023-09-30 12:24:05,029 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=710813.3333333334, ans=0.125 2023-09-30 12:24:07,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:08,918 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 12:24:10,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 12:24:10,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:15,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:24:15,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 12:24:17,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:18,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:24:22,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:22,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:25,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:26,557 INFO [train.py:1039] (1/4) Epoch 21, batch 400, loss[loss=0.1873, simple_loss=0.2585, pruned_loss=0.058, over 23676.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2495, pruned_loss=0.0489, over 4087986.81 frames. ], batch size: 232, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:24:26,975 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=710946.6666666666, ans=0.2 2023-09-30 12:24:28,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:24:29,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=710946.6666666666, ans=0.2 2023-09-30 12:24:30,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:24:30,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 12:24:32,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:32,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:34,954 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=710946.6666666666, ans=0.1 2023-09-30 12:24:36,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:24:36,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:39,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:40,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:42,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 12:24:42,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=711013.3333333334, ans=0.5 2023-09-30 12:24:43,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 12:24:43,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:44,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 12:24:44,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:49,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:24:49,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:49,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 12:24:49,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:24:49,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:50,709 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.787e+02 1.990e+02 2.388e+02 3.650e+02, threshold=3.979e+02, percent-clipped=0.0 2023-09-30 12:24:50,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:50,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:52,517 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 12:24:52,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 12:24:57,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:59,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:25:00,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 12:25:00,877 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 12:25:04,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:25:09,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:15,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 12:25:18,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.83 vs. limit=15.0 2023-09-30 12:25:18,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:25:21,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 12:25:22,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:25:25,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:25:25,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 12:25:30,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:25:31,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:25:33,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:25:37,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:37,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 12:25:40,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:25:41,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 12:25:42,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=711213.3333333334, ans=0.125 2023-09-30 12:25:44,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:25:44,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:25:46,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 12:25:47,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:25:49,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:25:49,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:25:50,701 INFO [train.py:1039] (1/4) Epoch 21, batch 450, loss[loss=0.1648, simple_loss=0.2482, pruned_loss=0.04067, over 24326.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2503, pruned_loss=0.0492, over 4233936.73 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:25:50,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 12:25:50,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:25:51,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:25:52,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:25:52,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 12:25:54,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:25:54,773 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:25:56,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:25:57,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:26:06,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=711346.6666666666, ans=0.125 2023-09-30 12:26:08,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:10,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:12,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 12:26:12,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 12:26:13,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=711346.6666666666, ans=0.05 2023-09-30 12:26:15,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:26:16,140 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=711346.6666666666, ans=0.125 2023-09-30 12:26:16,176 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=711346.6666666666, ans=0.125 2023-09-30 12:26:17,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:19,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:22,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:24,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:28,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 12:26:28,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 12:26:29,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 12:26:29,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:26:31,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:32,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:26:34,454 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 12:26:34,468 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 12:26:34,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:36,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:26:37,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:26:43,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:26:44,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:26:44,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:26:44,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 12:26:49,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:51,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:26:51,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:26:54,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 12:26:56,143 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=711546.6666666666, ans=0.2 2023-09-30 12:26:58,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:26:59,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 12:27:00,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 12:27:01,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:27:07,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:27:09,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:10,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:27:10,908 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 12:27:13,758 INFO [train.py:1039] (1/4) Epoch 21, batch 500, loss[loss=0.1568, simple_loss=0.2307, pruned_loss=0.04142, over 24437.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2509, pruned_loss=0.04937, over 4338719.52 frames. ], batch size: 58, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:27:15,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:17,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:27:17,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:17,312 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 12:27:19,334 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 12:27:19,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:21,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:27:25,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:27:28,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:27:30,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:30,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:31,320 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=711680.0, ans=0.125 2023-09-30 12:27:32,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:33,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=711680.0, ans=0.0 2023-09-30 12:27:37,393 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.852e+02 2.048e+02 2.259e+02 3.327e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:27:40,297 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=711680.0, ans=0.025 2023-09-30 12:27:42,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:42,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:27:42,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:27:42,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:44,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 12:27:44,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:27:44,771 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=711680.0, ans=0.1 2023-09-30 12:27:46,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:27:47,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:27:48,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:27:48,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:49,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 12:27:52,599 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 12:27:56,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:27:56,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:57,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:28:02,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 12:28:04,113 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=711813.3333333334, ans=0.125 2023-09-30 12:28:05,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:28:07,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:10,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:14,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:28:19,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:20,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 12:28:20,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:20,733 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:23,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 12:28:25,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:28:27,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:33,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 12:28:35,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 12:28:35,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:35,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 12:28:35,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:28:35,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:37,327 INFO [train.py:1039] (1/4) Epoch 21, batch 550, loss[loss=0.2105, simple_loss=0.2764, pruned_loss=0.0723, over 22661.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2529, pruned_loss=0.05061, over 4419346.45 frames. ], batch size: 322, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:28:37,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:28:39,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:28:42,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:44,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 12:28:44,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:28:48,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:28:48,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:52,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:28:55,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:55,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=712013.3333333334, ans=0.125 2023-09-30 12:28:55,390 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=712013.3333333334, ans=0.2 2023-09-30 12:28:57,540 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.33 vs. limit=22.5 2023-09-30 12:28:59,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 12:28:59,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 12:29:02,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:29:05,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=712013.3333333334, ans=0.5 2023-09-30 12:29:08,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:29:08,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:11,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:29:15,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:15,086 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 12:29:15,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:29:16,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:29:20,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:20,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:29:21,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:29:21,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:23,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 12:29:26,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 12:29:27,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:27,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:29:29,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:29:29,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:29:33,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:29:35,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:29:37,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:29:37,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:39,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 12:29:41,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:29:42,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:44,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:29:45,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:47,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:29:47,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:29:52,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 12:29:54,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 12:29:58,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:29:58,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:29:58,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:59,457 INFO [train.py:1039] (1/4) Epoch 21, batch 600, loss[loss=0.1655, simple_loss=0.2515, pruned_loss=0.03978, over 24437.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2532, pruned_loss=0.05059, over 4484516.78 frames. ], batch size: 69, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:30:05,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:30:07,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:30:09,147 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 12:30:11,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:30:12,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:14,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:17,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 12:30:17,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:30:21,802 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.803e+02 1.996e+02 2.235e+02 3.480e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 12:30:24,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 12:30:24,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=712346.6666666666, ans=0.2 2023-09-30 12:30:27,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:30:27,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:28,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:30:28,066 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=712346.6666666666, ans=0.1 2023-09-30 12:30:34,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:30:34,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:30:34,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:40,525 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:30:44,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:44,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:44,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:53,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 12:30:59,611 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:31:00,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:31:00,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:04,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 12:31:05,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:31:08,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 12:31:08,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:31:08,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:31:15,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:31:15,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:31:17,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:31:19,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:31:21,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:22,617 INFO [train.py:1039] (1/4) Epoch 21, batch 650, loss[loss=0.1697, simple_loss=0.2357, pruned_loss=0.05184, over 17215.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2519, pruned_loss=0.05047, over 4519276.49 frames. ], batch size: 37, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:31:24,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 12:31:24,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=712613.3333333334, ans=0.1 2023-09-30 12:31:25,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:31:31,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:31:31,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:36,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:40,892 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 12:31:43,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:31:43,943 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:45,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=712680.0, ans=0.0 2023-09-30 12:31:48,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:50,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:31:54,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:54,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:56,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:31:57,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:59,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:32:01,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:32:01,084 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 12:32:01,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:01,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:04,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:05,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:07,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:07,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:32:08,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 12:32:08,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:32:08,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:32:11,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:32:11,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:14,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:32:14,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 12:32:15,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 12:32:15,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:15,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:15,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:32:15,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:32:18,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:32:25,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=712813.3333333334, ans=0.125 2023-09-30 12:32:27,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:27,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:32:28,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:31,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:31,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:32:33,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:33,915 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=712880.0, ans=0.125 2023-09-30 12:32:41,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:32:41,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:42,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:32:42,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:45,051 INFO [train.py:1039] (1/4) Epoch 21, batch 700, loss[loss=0.169, simple_loss=0.2579, pruned_loss=0.04012, over 24531.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2504, pruned_loss=0.04991, over 4557398.95 frames. ], batch size: 71, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:32:46,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 12:32:48,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 12:32:51,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 12:32:51,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:53,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:32:54,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 12:32:59,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:02,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:33:04,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:05,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:33:07,181 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.845e+02 2.008e+02 2.196e+02 3.321e+02, threshold=4.016e+02, percent-clipped=0.0 2023-09-30 12:33:07,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:33:09,319 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=713013.3333333334, ans=0.2 2023-09-30 12:33:10,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:12,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:33:12,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:33:12,412 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=713013.3333333334, ans=0.125 2023-09-30 12:33:13,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 12:33:16,574 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.45 vs. limit=22.5 2023-09-30 12:33:17,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 12:33:21,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:33:21,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:33:23,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:33:26,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:33:26,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 12:33:31,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:31,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:33:34,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 12:33:35,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:37,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:38,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:33:43,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:33:43,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 12:33:49,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 12:33:49,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 12:33:52,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:54,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:33:55,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:33:58,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:58,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 12:34:03,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 12:34:03,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 12:34:03,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 12:34:05,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 12:34:06,590 INFO [train.py:1039] (1/4) Epoch 21, batch 750, loss[loss=0.1926, simple_loss=0.2688, pruned_loss=0.05821, over 23960.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2502, pruned_loss=0.04994, over 4588052.28 frames. ], batch size: 80, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:34:06,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 12:34:06,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:34:08,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 12:34:10,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:34:11,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:12,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:14,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:16,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:34:16,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:19,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:34:19,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:34:22,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:34:26,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:26,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:27,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 12:34:29,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:34:29,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:29,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=713346.6666666666, ans=0.2 2023-09-30 12:34:30,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:33,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:34:35,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 12:34:35,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:34:39,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 12:34:39,223 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 12:34:40,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 12:34:40,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:34:40,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:34:43,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:34:44,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.90 vs. limit=22.5 2023-09-30 12:34:51,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:51,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:34:51,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:34:53,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:53,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:53,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 12:34:55,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:34:56,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:34:57,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:35:02,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:35:04,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 12:35:05,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:10,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:11,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:35:12,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:14,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:35:18,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 12:35:18,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:18,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:22,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=713546.6666666666, ans=0.125 2023-09-30 12:35:23,841 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:23,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:25,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:25,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:35:30,171 INFO [train.py:1039] (1/4) Epoch 21, batch 800, loss[loss=0.1965, simple_loss=0.2679, pruned_loss=0.06255, over 23856.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2517, pruned_loss=0.0498, over 4632446.25 frames. ], batch size: 195, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:35:36,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:36,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:40,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:40,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:40,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:41,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:43,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:49,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:49,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:35:49,473 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=713680.0, ans=0.125 2023-09-30 12:35:49,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=713680.0, ans=0.125 2023-09-30 12:35:52,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 12:35:52,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:53,664 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.844e+02 2.013e+02 2.212e+02 3.409e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 12:35:53,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:53,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:35:53,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:35:55,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 12:35:55,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:55,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 12:36:00,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:02,109 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:05,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:36:05,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:36:06,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:06,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:13,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:36:13,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:36:13,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 12:36:15,770 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 12:36:15,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 12:36:15,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:36:15,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:18,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:18,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:36:24,026 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 12:36:25,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 12:36:26,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:36:28,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:36:33,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:36:35,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=713880.0, ans=0.125 2023-09-30 12:36:36,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:38,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 12:36:38,427 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:36:42,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 12:36:48,559 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=713880.0, ans=0.0 2023-09-30 12:36:51,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:36:53,263 INFO [train.py:1039] (1/4) Epoch 21, batch 850, loss[loss=0.1878, simple_loss=0.2696, pruned_loss=0.05305, over 24060.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2525, pruned_loss=0.05002, over 4657591.32 frames. ], batch size: 80, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:36:53,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:36:53,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 12:36:54,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:36:54,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:56,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 12:36:56,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:57,338 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-09-30 12:36:58,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:58,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:00,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:37:01,883 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:37:04,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 12:37:04,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 12:37:04,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 12:37:06,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:37:06,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:08,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:09,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:37:09,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:37:14,282 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-09-30 12:37:15,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:15,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:15,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 12:37:19,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 12:37:19,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=714013.3333333334, ans=0.0 2023-09-30 12:37:23,442 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:24,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 12:37:28,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 12:37:29,677 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 12:37:33,321 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 12:37:33,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:33,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:37:33,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:37:36,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 12:37:41,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:42,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:42,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:37:44,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:37:44,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:37:46,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:37:46,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 12:37:48,483 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=714146.6666666666, ans=0.125 2023-09-30 12:37:51,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:37:51,437 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:37:52,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:37:52,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:54,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:55,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=714146.6666666666, ans=0.2 2023-09-30 12:37:56,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:37:58,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:37:59,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:01,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:38:01,809 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=714213.3333333334, ans=0.0 2023-09-30 12:38:01,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=714213.3333333334, ans=0.2 2023-09-30 12:38:08,275 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=714213.3333333334, ans=0.125 2023-09-30 12:38:09,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:38:11,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:38:11,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 12:38:12,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:12,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:38:14,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 12:38:15,650 INFO [train.py:1039] (1/4) Epoch 21, batch 900, loss[loss=0.1711, simple_loss=0.2399, pruned_loss=0.05114, over 24302.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2531, pruned_loss=0.05049, over 4674791.36 frames. ], batch size: 56, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:38:19,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:38:22,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:24,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 12:38:27,816 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.93 vs. limit=22.5 2023-09-30 12:38:29,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:38:29,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 12:38:31,339 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:38:32,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:32,878 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:32,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:38:32,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:38:39,495 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.031e+02 2.211e+02 2.952e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 12:38:42,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:38:42,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:42,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:38:44,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:51,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 12:38:53,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:38:57,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=714413.3333333334, ans=0.0 2023-09-30 12:39:00,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:39:00,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:39:01,934 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 12:39:02,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 12:39:09,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:39:09,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:39:10,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:39:11,155 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=714480.0, ans=0.0 2023-09-30 12:39:18,101 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-30 12:39:19,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:19,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:39:21,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 12:39:21,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:39:25,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 12:39:26,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:39:26,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:27,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=714546.6666666666, ans=0.0 2023-09-30 12:39:28,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:39:29,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:39:35,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 12:39:35,178 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 12:39:36,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:39:36,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 12:39:38,216 INFO [train.py:1039] (1/4) Epoch 21, batch 950, loss[loss=0.1861, simple_loss=0.2659, pruned_loss=0.05315, over 24002.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2534, pruned_loss=0.05054, over 4688014.60 frames. ], batch size: 80, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:39:39,855 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:45,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 12:39:48,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:39:52,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:52,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:54,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:39:57,220 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 12:40:00,607 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714680.0, ans=0.1 2023-09-30 12:40:01,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:01,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:02,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:02,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:40:03,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 12:40:03,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:40:05,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:07,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 12:40:07,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:12,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:13,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:13,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:40:14,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=714746.6666666666, ans=0.125 2023-09-30 12:40:15,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 12:40:17,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:40:18,827 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.11 vs. limit=15.0 2023-09-30 12:40:19,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:21,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:40:21,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=714746.6666666666, ans=0.0 2023-09-30 12:40:26,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:40:26,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:29,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 12:40:31,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:40:31,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:40:31,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=714813.3333333334, ans=0.2 2023-09-30 12:40:34,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:34,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:34,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:40:37,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 12:40:39,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:40:40,674 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:41,076 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=714813.3333333334, ans=0.1 2023-09-30 12:40:42,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:42,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 12:40:42,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:42,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:40:42,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 12:40:48,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:40:51,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:54,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:40:56,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 12:40:56,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 12:40:59,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:41:02,696 INFO [train.py:1039] (1/4) Epoch 21, batch 1000, loss[loss=0.1922, simple_loss=0.2701, pruned_loss=0.05718, over 24069.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2528, pruned_loss=0.05061, over 4699258.70 frames. ], batch size: 86, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:41:02,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 12:41:03,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:03,290 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=714946.6666666666, ans=0.0 2023-09-30 12:41:08,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:41:10,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 12:41:10,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 12:41:15,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:15,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:41:18,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:21,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 12:41:24,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=715013.3333333334, ans=0.0 2023-09-30 12:41:25,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 12:41:27,770 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.831e+02 2.095e+02 2.362e+02 3.753e+02, threshold=4.190e+02, percent-clipped=0.0 2023-09-30 12:41:27,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 12:41:28,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=715013.3333333334, ans=22.5 2023-09-30 12:41:29,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:30,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 12:41:33,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 12:41:33,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 12:41:34,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:35,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:36,018 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=715080.0, ans=0.1 2023-09-30 12:41:39,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=715080.0, ans=0.125 2023-09-30 12:41:40,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=715080.0, ans=0.125 2023-09-30 12:41:44,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:45,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:41:46,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:48,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:48,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 12:41:48,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:50,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:41:51,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:51,667 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 12:41:54,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 12:41:56,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 12:41:59,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 12:42:02,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:42:08,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:08,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:42:10,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:10,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:42:11,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 12:42:13,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:42:13,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 12:42:13,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 12:42:15,240 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:15,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:42:18,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:42:21,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:42:23,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:42:25,001 INFO [train.py:1039] (1/4) Epoch 21, batch 1050, loss[loss=0.1784, simple_loss=0.2627, pruned_loss=0.04704, over 24037.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2519, pruned_loss=0.04999, over 4719749.92 frames. ], batch size: 80, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:42:25,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:42:26,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:42:28,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:42:29,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:31,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:42:35,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:42:36,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:42:39,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:42:40,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:42:40,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:42:41,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:42:43,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 12:42:43,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:43,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 12:42:45,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:45,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 12:42:46,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:42:46,854 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:42:52,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:52,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:42:52,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:53,553 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.22 vs. limit=15.0 2023-09-30 12:42:57,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 12:42:57,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 12:42:59,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:43:00,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 12:43:01,145 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:43:03,796 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.26 vs. limit=15.0 2023-09-30 12:43:04,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 12:43:06,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:43:11,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:43:13,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:43:13,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:43:16,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:43:19,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 12:43:21,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 12:43:21,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 12:43:22,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:22,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:43:24,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 12:43:29,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:43:31,161 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=715546.6666666666, ans=0.125 2023-09-30 12:43:32,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:32,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:43:32,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:32,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 12:43:40,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:40,089 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 12:43:41,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 12:43:42,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:43:45,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:43:47,962 INFO [train.py:1039] (1/4) Epoch 21, batch 1100, loss[loss=0.1718, simple_loss=0.2432, pruned_loss=0.05021, over 23782.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2505, pruned_loss=0.04967, over 4710019.35 frames. ], batch size: 212, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:43:51,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:43:57,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:43:58,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:43:58,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:00,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 12:44:01,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:44:05,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:44:07,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:44:10,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:44:10,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 12:44:11,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:44:13,695 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.860e+02 2.075e+02 2.430e+02 4.755e+02, threshold=4.150e+02, percent-clipped=2.0 2023-09-30 12:44:14,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:14,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:44:17,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:44:19,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:44:24,334 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:44:28,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 12:44:28,978 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 12:44:30,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:44:35,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:44:38,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 12:44:38,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:44:38,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:44:38,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:44:38,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:40,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 12:44:42,572 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=715813.3333333334, ans=0.0 2023-09-30 12:44:46,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:44:46,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 12:44:50,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:44:53,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:44:56,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 12:44:56,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:44:58,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:00,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:00,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:03,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 12:45:04,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:45:04,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:06,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 12:45:06,430 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:45:07,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 12:45:08,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:45:08,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:45:09,418 INFO [train.py:1039] (1/4) Epoch 21, batch 1150, loss[loss=0.1625, simple_loss=0.2409, pruned_loss=0.04207, over 24640.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2508, pruned_loss=0.04957, over 4718484.52 frames. ], batch size: 65, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:45:09,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:45:10,065 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=715946.6666666666, ans=0.1 2023-09-30 12:45:13,246 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.30 vs. limit=22.5 2023-09-30 12:45:16,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:19,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:45:20,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:20,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:45:21,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 12:45:23,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:25,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 12:45:26,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:26,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:45:27,069 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=716013.3333333334, ans=0.125 2023-09-30 12:45:31,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 12:45:33,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:38,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:38,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:45:38,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 12:45:38,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:45:38,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:43,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=716080.0, ans=0.0 2023-09-30 12:45:44,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 12:45:44,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716080.0, ans=0.1 2023-09-30 12:45:45,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:48,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:56,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:45:57,090 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:45:58,978 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.52 vs. limit=22.5 2023-09-30 12:46:01,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:46:01,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 12:46:02,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:03,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:12,679 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 12:46:14,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:22,547 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 12:46:25,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:27,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:46:27,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:46:27,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:46:32,273 INFO [train.py:1039] (1/4) Epoch 21, batch 1200, loss[loss=0.1722, simple_loss=0.2577, pruned_loss=0.04329, over 24298.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.04967, over 4712167.62 frames. ], batch size: 74, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:46:32,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:37,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:46:37,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:46:40,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:46:40,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:40,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:46:42,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:46:43,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:46:47,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:47,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:50,466 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 12:46:52,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 12:46:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:46:58,564 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.836e+02 2.061e+02 2.415e+02 4.765e+02, threshold=4.121e+02, percent-clipped=1.0 2023-09-30 12:47:00,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:47:01,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:03,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:03,314 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 12:47:03,414 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=716413.3333333334, ans=0.015 2023-09-30 12:47:04,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:10,739 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=716413.3333333334, ans=0.0 2023-09-30 12:47:12,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:47:12,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:47:13,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 12:47:14,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:47:19,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 12:47:23,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=716480.0, ans=0.2 2023-09-30 12:47:24,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 12:47:24,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:25,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:47:27,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:28,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:47:30,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:30,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:47:32,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:47:32,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 12:47:34,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:47:34,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:34,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:47:37,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:47:37,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:41,272 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:47:42,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:47:46,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 12:47:49,805 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 12:47:52,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:53,999 INFO [train.py:1039] (1/4) Epoch 21, batch 1250, loss[loss=0.2292, simple_loss=0.2854, pruned_loss=0.0865, over 19265.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.252, pruned_loss=0.04976, over 4716234.48 frames. ], batch size: 388, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:47:54,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:57,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:47:59,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:48:03,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 12:48:05,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:48:07,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:07,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 12:48:10,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:48:12,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:48:14,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=716680.0, ans=0.0 2023-09-30 12:48:15,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:48:17,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:17,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:48:17,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:17,761 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716680.0, ans=0.1 2023-09-30 12:48:21,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:48:21,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=716680.0, ans=0.125 2023-09-30 12:48:25,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:48:25,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:48:25,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:48:27,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:28,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:31,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:33,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:48:35,591 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=716746.6666666666, ans=0.125 2023-09-30 12:48:38,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 12:48:39,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:48:43,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:48:44,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 12:48:45,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:45,063 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 12:48:45,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:45,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:48,650 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=716813.3333333334, ans=0.2 2023-09-30 12:48:50,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:48:52,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=716813.3333333334, ans=0.2 2023-09-30 12:48:53,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 12:48:53,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 12:48:55,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 12:48:56,156 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=716813.3333333334, ans=0.0 2023-09-30 12:48:58,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:00,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 12:49:00,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:00,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=716880.0, ans=0.125 2023-09-30 12:49:03,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:49:03,672 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:49:05,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 12:49:05,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:49:06,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:49:06,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:49:06,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:10,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 12:49:10,790 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:49:12,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:14,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:49:14,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:49:17,275 INFO [train.py:1039] (1/4) Epoch 21, batch 1300, loss[loss=0.1616, simple_loss=0.243, pruned_loss=0.04014, over 24552.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2524, pruned_loss=0.05013, over 4708850.81 frames. ], batch size: 60, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:49:17,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:49:19,225 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=716946.6666666666, ans=0.1 2023-09-30 12:49:20,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:21,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 12:49:27,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:27,568 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=716946.6666666666, ans=0.125 2023-09-30 12:49:28,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:49:30,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:49:31,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:31,814 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:49:33,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 12:49:33,585 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=717013.3333333334, ans=0.125 2023-09-30 12:49:37,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:49:39,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:49:39,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 12:49:43,781 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.834e+02 2.043e+02 2.344e+02 3.785e+02, threshold=4.086e+02, percent-clipped=0.0 2023-09-30 12:49:45,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:49:50,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:50,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:50,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=717080.0, ans=0.125 2023-09-30 12:49:51,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:53,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:54,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:49:56,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:49:56,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 12:50:02,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:50:02,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:50:05,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 12:50:05,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:50:06,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:50:08,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:50:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 12:50:10,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:11,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 12:50:13,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:16,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:50:16,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:50:22,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 12:50:22,129 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 12:50:25,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 12:50:28,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:50:31,452 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 12:50:33,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:39,718 INFO [train.py:1039] (1/4) Epoch 21, batch 1350, loss[loss=0.1703, simple_loss=0.2369, pruned_loss=0.05187, over 23729.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2518, pruned_loss=0.04986, over 4709446.26 frames. ], batch size: 164, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:50:39,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 12:50:42,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:44,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:50:46,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:48,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:50,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:50:51,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:55,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:55,537 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=717346.6666666666, ans=0.2 2023-09-30 12:50:56,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 12:50:58,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:50:59,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:51:02,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 12:51:04,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:51:04,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:51:04,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 12:51:06,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 12:51:09,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 12:51:12,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:12,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 12:51:24,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:31,289 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.51 vs. limit=15.0 2023-09-30 12:51:33,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:35,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:35,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 12:51:38,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:41,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 12:51:41,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:51:41,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:51:45,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:51:47,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 12:51:48,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:51:49,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=717546.6666666666, ans=0.0 2023-09-30 12:51:51,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-09-30 12:51:52,434 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:51:54,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 12:51:55,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 12:52:00,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=717613.3333333334, ans=0.0 2023-09-30 12:52:01,858 INFO [train.py:1039] (1/4) Epoch 21, batch 1400, loss[loss=0.1753, simple_loss=0.2656, pruned_loss=0.04254, over 24537.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2507, pruned_loss=0.0496, over 4710125.29 frames. ], batch size: 71, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:52:02,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 12:52:02,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=717613.3333333334, ans=0.125 2023-09-30 12:52:04,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:52:07,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:52:08,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:52:12,175 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 12:52:12,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.75 vs. limit=15.0 2023-09-30 12:52:13,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 12:52:24,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:52:28,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:30,113 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.890e+02 2.143e+02 2.435e+02 3.256e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-30 12:52:30,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:52:30,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:52:35,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:52:36,613 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:52:39,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=717746.6666666666, ans=0.025 2023-09-30 12:52:48,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:48,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:55,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 12:52:55,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:52:55,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:52:57,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:52:57,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:58,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:52:58,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:52:58,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:53:00,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=717813.3333333334, ans=0.0 2023-09-30 12:53:01,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 12:53:01,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:53:03,744 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=717813.3333333334, ans=0.125 2023-09-30 12:53:06,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:09,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:53:17,340 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 12:53:18,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:53:20,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:53:21,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:53:23,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:25,611 INFO [train.py:1039] (1/4) Epoch 21, batch 1450, loss[loss=0.1675, simple_loss=0.2536, pruned_loss=0.04067, over 24619.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2497, pruned_loss=0.04942, over 4707596.70 frames. ], batch size: 68, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:53:25,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:53:28,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:53:31,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:53:32,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:32,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:53:38,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:39,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:53:41,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:53:41,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 12:53:42,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:53:45,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 12:53:46,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:46,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:46,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 12:53:48,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:53:49,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:53:49,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 12:53:49,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:51,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:53:53,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:53,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=718013.3333333334, ans=10.0 2023-09-30 12:53:54,952 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=718013.3333333334, ans=0.09899494936611666 2023-09-30 12:53:56,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:00,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:54:00,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:54:03,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:54:03,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:04,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:04,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:54:04,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:06,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:10,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 12:54:12,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:54:17,851 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 12:54:19,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:20,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:54:22,410 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:22,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 12:54:27,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:27,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 12:54:31,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 12:54:31,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:35,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:54:36,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:38,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 12:54:39,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=718213.3333333334, ans=0.125 2023-09-30 12:54:41,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 12:54:41,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 12:54:42,717 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:44,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:54:47,160 INFO [train.py:1039] (1/4) Epoch 21, batch 1500, loss[loss=0.1497, simple_loss=0.2385, pruned_loss=0.03048, over 24509.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2502, pruned_loss=0.04944, over 4717084.07 frames. ], batch size: 66, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:54:49,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=718280.0, ans=0.125 2023-09-30 12:54:55,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 12:54:55,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:54:55,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:54:57,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:57,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:54:57,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=718280.0, ans=0.125 2023-09-30 12:54:59,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:54:59,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 12:55:01,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:55:01,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:55:01,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:55:01,663 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=718280.0, ans=0.125 2023-09-30 12:55:01,837 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=718280.0, ans=0.1 2023-09-30 12:55:03,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:55:05,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:06,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:13,345 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:13,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 12:55:14,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:55:16,115 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.806e+02 1.957e+02 2.271e+02 3.905e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 12:55:16,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:55:17,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:20,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 12:55:24,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 12:55:26,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:55:26,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 12:55:29,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:55:30,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:55:32,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:32,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:55:32,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 12:55:33,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:55:33,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:36,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 12:55:36,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:41,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:55:41,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 12:55:43,637 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.77 vs. limit=15.0 2023-09-30 12:55:48,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:55:49,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:55:52,949 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 12:55:53,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:55:53,029 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 12:55:53,311 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718546.6666666666, ans=0.1 2023-09-30 12:55:54,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:56,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:55:58,044 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 12:55:59,592 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:56:01,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 12:56:02,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:06,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:07,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:08,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:08,804 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-09-30 12:56:09,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:09,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:56:09,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 12:56:11,613 INFO [train.py:1039] (1/4) Epoch 21, batch 1550, loss[loss=0.1675, simple_loss=0.2505, pruned_loss=0.04227, over 24484.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2509, pruned_loss=0.04977, over 4712402.31 frames. ], batch size: 63, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:56:11,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 12:56:11,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:56:13,232 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 12:56:14,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 12:56:16,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:16,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=718613.3333333334, ans=0.0 2023-09-30 12:56:18,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:19,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:19,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:56:20,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:23,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:25,246 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 12:56:25,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:26,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:56:28,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:56:29,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:56:29,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 12:56:31,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:33,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 12:56:33,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 12:56:35,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 12:56:35,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:36,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=718680.0, ans=0.125 2023-09-30 12:56:38,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:41,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:56:44,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 12:56:44,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 12:56:54,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:57,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:57,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:56:57,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:56:59,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 12:57:00,448 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=718813.3333333334, ans=0.125 2023-09-30 12:57:04,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:57:06,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:11,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:57:14,354 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:57:14,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:57:15,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 12:57:15,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:16,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:57:17,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:17,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 12:57:17,654 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 12:57:21,441 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=718880.0, ans=0.0 2023-09-30 12:57:22,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:27,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 12:57:29,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=718880.0, ans=0.0 2023-09-30 12:57:33,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:34,719 INFO [train.py:1039] (1/4) Epoch 21, batch 1600, loss[loss=0.199, simple_loss=0.2625, pruned_loss=0.06779, over 22730.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2516, pruned_loss=0.05016, over 4712462.40 frames. ], batch size: 322, lr: 4.92e-03, grad_scale: 16.0 2023-09-30 12:57:34,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:34,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 12:57:35,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=718946.6666666666, ans=0.05 2023-09-30 12:57:36,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:36,659 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=718946.6666666666, ans=0.125 2023-09-30 12:57:37,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:37,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:57:39,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:57:39,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:57:39,740 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=718946.6666666666, ans=0.0 2023-09-30 12:57:41,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:43,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 12:57:44,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 12:57:47,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 12:57:50,850 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:57:52,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 12:57:52,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:57:56,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:57:58,242 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=719013.3333333334, ans=0.0 2023-09-30 12:57:58,618 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.08 vs. limit=10.0 2023-09-30 12:57:59,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:58:01,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 12:58:04,560 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.831e+02 2.011e+02 2.218e+02 3.597e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:58:05,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:58:06,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 12:58:06,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:07,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 12:58:11,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 12:58:20,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:21,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 12:58:21,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:23,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:58:23,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:58:24,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 12:58:28,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 12:58:30,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:58:30,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:31,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:33,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:58:34,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:58:36,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:58:36,335 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=719146.6666666666, ans=0.125 2023-09-30 12:58:37,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:58:43,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:43,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:58:47,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 12:58:47,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:58:48,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 12:58:53,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:58:56,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:58:56,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:58:56,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 12:58:56,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 12:58:56,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 12:58:56,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 12:58:57,950 INFO [train.py:1039] (1/4) Epoch 21, batch 1650, loss[loss=0.1643, simple_loss=0.2524, pruned_loss=0.03806, over 24664.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2522, pruned_loss=0.05049, over 4714964.78 frames. ], batch size: 65, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:59:01,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:59:01,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:01,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:03,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:59:06,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:09,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 12:59:13,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:59:13,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:13,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:59:13,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:59:14,426 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.92 vs. limit=15.0 2023-09-30 12:59:15,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 12:59:15,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 12:59:21,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:59:24,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:59:32,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 12:59:34,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:35,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 12:59:40,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:59:42,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:59:44,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:59:46,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:59:46,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.60 vs. limit=10.0 2023-09-30 12:59:47,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:59:47,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:48,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:49,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:49,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:51,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:59:52,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:52,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:59:56,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:57,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 13:00:00,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:00:00,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 13:00:02,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 13:00:02,314 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 13:00:02,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:03,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:00:03,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:05,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:00:05,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 13:00:08,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:12,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:00:12,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:15,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 13:00:20,677 INFO [train.py:1039] (1/4) Epoch 21, batch 1700, loss[loss=0.1588, simple_loss=0.2368, pruned_loss=0.04034, over 18577.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2515, pruned_loss=0.05018, over 4712364.92 frames. ], batch size: 40, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:00:20,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:20,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:00:20,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 13:00:20,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:20,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:00:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:24,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:00:24,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:00:25,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 13:00:26,736 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=12.0 2023-09-30 13:00:27,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:00:32,813 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=719613.3333333334, ans=0.0 2023-09-30 13:00:37,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:39,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=719680.0, ans=0.125 2023-09-30 13:00:40,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:00:45,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:00:45,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:00:45,580 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:47,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:00:49,899 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.892e+02 2.034e+02 2.348e+02 3.587e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 13:00:50,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 13:00:51,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:00:51,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:55,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:00:56,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:00:58,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 13:00:59,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 13:01:00,615 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:02,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 13:01:03,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:01:09,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=719813.3333333334, ans=0.125 2023-09-30 13:01:14,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:16,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:16,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:01:18,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:01:18,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 13:01:18,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:01:21,680 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:21,682 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 13:01:21,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:01:21,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:23,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:23,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:24,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:24,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:01:25,509 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.74 vs. limit=15.0 2023-09-30 13:01:26,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:26,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:01:26,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:33,189 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:34,747 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 13:01:36,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:38,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:41,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 13:01:42,778 INFO [train.py:1039] (1/4) Epoch 21, batch 1750, loss[loss=0.1642, simple_loss=0.2283, pruned_loss=0.05011, over 23556.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2503, pruned_loss=0.04905, over 4717032.43 frames. ], batch size: 256, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:01:46,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:47,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:49,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:01:49,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 13:01:50,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:54,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:01:54,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:01,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=720013.3333333334, ans=0.0 2023-09-30 13:02:02,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 13:02:04,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:06,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 13:02:06,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:08,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:02:11,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:02:13,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 13:02:14,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=720013.3333333334, ans=0.125 2023-09-30 13:02:15,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:02:16,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 13:02:21,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=720080.0, ans=0.125 2023-09-30 13:02:23,562 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.58 vs. limit=15.0 2023-09-30 13:02:24,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:02:24,842 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=720080.0, ans=0.1 2023-09-30 13:02:27,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:02:27,598 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:31,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:31,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:32,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:02:34,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:36,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:36,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:37,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 13:02:39,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:41,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 13:02:41,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:43,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:43,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=720146.6666666666, ans=0.0 2023-09-30 13:02:44,231 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.90 vs. limit=15.0 2023-09-30 13:02:45,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:02:48,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:02:48,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 13:02:48,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:51,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:56,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:58,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=720213.3333333334, ans=0.125 2023-09-30 13:02:59,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:01,395 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:03:01,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 13:03:01,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:03,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:03:03,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:03,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:03:03,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:03:03,407 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=720213.3333333334, ans=0.1 2023-09-30 13:03:04,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:03:09,629 INFO [train.py:1039] (1/4) Epoch 21, batch 1800, loss[loss=0.1736, simple_loss=0.2558, pruned_loss=0.04576, over 24454.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2489, pruned_loss=0.04815, over 4718717.25 frames. ], batch size: 66, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:03:09,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:03:09,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:03:11,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:03:13,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:18,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:03:18,481 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:03:21,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:24,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:24,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:25,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=720346.6666666666, ans=0.125 2023-09-30 13:03:26,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:03:27,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:27,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 13:03:29,377 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:32,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:37,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 13:03:39,071 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.961e+02 2.256e+02 2.662e+02 3.514e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-30 13:03:39,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 13:03:40,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 13:03:40,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:41,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:41,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:03:43,188 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:03:51,548 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 13:03:51,768 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=720413.3333333334, ans=0.1 2023-09-30 13:03:54,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:03:56,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:58,293 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 13:03:58,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 13:03:59,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:04:00,080 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=720480.0, ans=0.125 2023-09-30 13:04:01,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:04:02,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:04:07,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 13:04:14,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:04:14,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 13:04:14,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=720546.6666666666, ans=0.1 2023-09-30 13:04:16,247 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:04:16,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:17,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:04:17,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 13:04:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:04:20,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:23,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 13:04:23,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:26,208 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:26,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:04:26,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:27,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:29,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:04:30,784 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:04:30,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:32,742 INFO [train.py:1039] (1/4) Epoch 21, batch 1850, loss[loss=0.1883, simple_loss=0.2532, pruned_loss=0.06166, over 23866.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2498, pruned_loss=0.04802, over 4743268.39 frames. ], batch size: 179, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:04:34,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:04:36,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:04:45,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:04:45,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 13:04:50,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 13:04:52,972 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.01 vs. limit=15.0 2023-09-30 13:04:53,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=720680.0, ans=0.125 2023-09-30 13:04:55,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 13:04:58,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:59,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 13:05:00,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:05:06,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=720746.6666666666, ans=0.04949747468305833 2023-09-30 13:05:07,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:05:09,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 13:05:12,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:12,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:12,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=720746.6666666666, ans=0.125 2023-09-30 13:05:15,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=720746.6666666666, ans=0.125 2023-09-30 13:05:15,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=720746.6666666666, ans=0.0 2023-09-30 13:05:17,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 13:05:17,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:17,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:05:17,369 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=720746.6666666666, ans=0.05 2023-09-30 13:05:19,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:05:20,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:05:24,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:05:24,284 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=720813.3333333334, ans=0.0 2023-09-30 13:05:27,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:05:27,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:27,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:05:27,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:05:30,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:32,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:05:36,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 13:05:36,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:41,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:05:41,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:05:41,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 13:05:41,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 13:05:44,653 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 13:05:44,789 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 13:05:45,346 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.89 vs. limit=15.0 2023-09-30 13:05:47,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:05:47,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:47,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:05:47,783 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:49,283 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 13:05:49,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:05:50,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:50,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:05:53,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:05:54,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:54,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 13:05:56,049 INFO [train.py:1039] (1/4) Epoch 21, batch 1900, loss[loss=0.1754, simple_loss=0.2618, pruned_loss=0.04449, over 24641.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2501, pruned_loss=0.04815, over 4738166.79 frames. ], batch size: 68, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:05:57,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:57,853 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 13:05:57,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:05:59,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:01,215 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=720946.6666666666, ans=0.2 2023-09-30 13:06:04,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:07,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:06:07,299 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 13:06:09,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 13:06:11,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:06:11,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:06:12,870 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 13:06:12,931 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 13:06:16,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 13:06:19,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:06:22,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 13:06:22,893 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=721013.3333333334, ans=0.0 2023-09-30 13:06:24,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 13:06:26,414 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.817e+02 1.990e+02 2.367e+02 3.522e+02, threshold=3.980e+02, percent-clipped=0.0 2023-09-30 13:06:35,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 13:06:40,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 13:06:40,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:06:41,616 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 13:06:41,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 13:06:41,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 13:06:41,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 13:06:41,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:06:47,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 13:06:50,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:06:55,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:06:55,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 13:06:55,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:07:00,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 13:07:00,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:08,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:07:08,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:07:08,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:07:10,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:07:10,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:07:10,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:07:12,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:07:14,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:14,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:16,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:07:16,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:07:16,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:18,230 INFO [train.py:1039] (1/4) Epoch 21, batch 1950, loss[loss=0.1772, simple_loss=0.2465, pruned_loss=0.05392, over 23415.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2506, pruned_loss=0.0484, over 4731985.52 frames. ], batch size: 285, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:07:18,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:22,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:22,453 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=721280.0, ans=0.1 2023-09-30 13:07:24,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:07:25,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:25,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:07:28,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 13:07:28,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:07:28,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:30,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:33,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:07:33,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:34,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:36,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:07:41,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:41,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:07:41,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:07:41,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:44,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:47,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:47,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:47,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:07:47,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 13:07:49,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:07:50,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:07:50,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:51,089 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=721413.3333333334, ans=0.125 2023-09-30 13:07:53,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:56,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:08:00,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:08:03,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:08:03,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:03,810 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 13:08:03,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:12,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:08:12,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:08:13,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:19,182 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=15.0 2023-09-30 13:08:21,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:23,509 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:25,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:29,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:33,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:08:33,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:35,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 13:08:35,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:08:36,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:08:38,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 13:08:39,450 INFO [train.py:1039] (1/4) Epoch 21, batch 2000, loss[loss=0.1623, simple_loss=0.2387, pruned_loss=0.04289, over 17949.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2521, pruned_loss=0.04889, over 4729184.64 frames. ], batch size: 39, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:08:39,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:08:45,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:45,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:08:45,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:46,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:08:49,635 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:52,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 13:08:52,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:55,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:08:57,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 13:08:59,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:08:59,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:09:00,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:09:02,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 13:09:02,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:03,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:04,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:06,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 13:09:07,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:09:08,954 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.918e+02 2.130e+02 2.425e+02 4.087e+02, threshold=4.260e+02, percent-clipped=1.0 2023-09-30 13:09:10,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 13:09:10,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:14,585 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:14,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:09:14,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:16,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:16,846 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=721746.6666666666, ans=0.0 2023-09-30 13:09:18,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:18,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 13:09:21,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 13:09:21,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:21,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:25,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:27,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:09:28,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:29,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:09:31,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:31,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:33,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:33,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:34,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:38,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:38,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 13:09:45,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:09:45,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:09:54,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:55,023 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=721880.0, ans=0.125 2023-09-30 13:09:56,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:56,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:57,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:09:57,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:10:00,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:02,019 INFO [train.py:1039] (1/4) Epoch 21, batch 2050, loss[loss=0.1597, simple_loss=0.2368, pruned_loss=0.04133, over 23328.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2523, pruned_loss=0.04954, over 4715448.53 frames. ], batch size: 134, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:10:02,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:05,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:10:06,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:10,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:10:11,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:10:13,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:15,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:10:17,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 13:10:17,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:10:17,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=722013.3333333334, ans=0.0 2023-09-30 13:10:20,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:10:20,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:10:30,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:30,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:33,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 13:10:35,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:37,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 13:10:38,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:41,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:43,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:43,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:10:44,924 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:46,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:10:48,097 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:10:49,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:10:51,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:53,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:10:55,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:10:56,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=722146.6666666666, ans=0.2 2023-09-30 13:10:57,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:11:02,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:08,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:11:09,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 13:11:16,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:16,310 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:11:18,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:11:20,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 13:11:24,546 INFO [train.py:1039] (1/4) Epoch 21, batch 2100, loss[loss=0.1924, simple_loss=0.2606, pruned_loss=0.0621, over 23354.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2508, pruned_loss=0.04923, over 4711517.28 frames. ], batch size: 119, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:11:24,685 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 13:11:24,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:24,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:26,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:26,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:27,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 13:11:27,282 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:11:28,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 13:11:28,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:32,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:11:33,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:11:33,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:35,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:11:35,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 13:11:37,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:11:38,455 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 13:11:38,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 13:11:41,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:11:41,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:11:41,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 13:11:42,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:11:46,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 13:11:46,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:49,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:11:50,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:53,682 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.414e+02 1.855e+02 2.014e+02 2.188e+02 4.712e+02, threshold=4.028e+02, percent-clipped=1.0 2023-09-30 13:11:53,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:11:56,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 13:11:58,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:11:58,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 13:11:59,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 13:12:01,662 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:01,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 13:12:01,752 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 13:12:02,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=722413.3333333334, ans=0.0 2023-09-30 13:12:03,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 13:12:05,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:12:06,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:12:09,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:10,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:11,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:13,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:13,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 13:12:13,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:13,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:14,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:14,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 13:12:16,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 13:12:16,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 13:12:22,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:12:23,970 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=722480.0, ans=0.125 2023-09-30 13:12:25,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:12:26,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 13:12:32,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:36,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:12:36,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:12:36,406 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:12:36,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 13:12:36,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:12:38,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:38,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:12:40,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:12:40,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:41,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 13:12:41,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=722546.6666666666, ans=0.1 2023-09-30 13:12:43,341 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 13:12:43,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:46,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:46,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:12:47,694 INFO [train.py:1039] (1/4) Epoch 21, batch 2150, loss[loss=0.1654, simple_loss=0.2512, pruned_loss=0.03978, over 24667.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2492, pruned_loss=0.04883, over 4708183.18 frames. ], batch size: 68, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:12:47,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:12:47,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:12:48,690 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=15.0 2023-09-30 13:12:52,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:12:54,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:55,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:57,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:12:57,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:12:57,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:13:02,083 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:04,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:13:04,063 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:13:07,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:07,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 13:13:13,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:13,251 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:13:14,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:14,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:16,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:16,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:13:16,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:16,696 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=722680.0, ans=0.2 2023-09-30 13:13:17,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:13:17,893 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:13:19,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 13:13:20,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:13:20,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:21,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:24,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:13:24,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:13:26,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:27,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:13:29,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:29,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 13:13:29,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:13:32,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:32,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:34,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:37,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:13:39,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:41,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:41,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 13:13:43,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 13:13:43,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:13:43,459 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 13:13:43,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:44,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:13:45,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 13:13:45,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:13:45,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 13:13:45,177 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 13:13:45,177 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 13:13:46,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 13:13:49,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:51,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:51,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:13:51,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:52,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:13:54,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:54,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:03,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:14:03,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 13:14:08,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:14:09,586 INFO [train.py:1039] (1/4) Epoch 21, batch 2200, loss[loss=0.1798, simple_loss=0.2603, pruned_loss=0.04969, over 24008.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2498, pruned_loss=0.04897, over 4706109.67 frames. ], batch size: 80, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:14:13,278 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:13,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=722946.6666666666, ans=0.125 2023-09-30 13:14:14,773 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.91 vs. limit=10.0 2023-09-30 13:14:15,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:14:15,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:16,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:14:20,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:14:20,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:14:20,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 13:14:25,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 13:14:26,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:14:28,664 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=723013.3333333334, ans=0.125 2023-09-30 13:14:30,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=723013.3333333334, ans=0.125 2023-09-30 13:14:31,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 13:14:34,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:14:36,172 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:14:37,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:14:39,229 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.806e+02 1.948e+02 2.214e+02 3.228e+02, threshold=3.896e+02, percent-clipped=0.0 2023-09-30 13:14:39,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 13:14:42,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:14:42,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=723080.0, ans=0.125 2023-09-30 13:14:44,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:46,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:14:47,649 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=22.5 2023-09-30 13:14:50,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:14:51,386 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.67 vs. limit=6.0 2023-09-30 13:14:52,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:14:55,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:14:56,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:58,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 13:14:59,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:00,556 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.14 vs. limit=15.0 2023-09-30 13:15:01,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 13:15:03,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:03,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:15:04,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:06,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:15:06,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:06,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:07,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:09,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:15:09,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:15:10,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:15:12,694 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=723146.6666666666, ans=0.125 2023-09-30 13:15:15,307 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:15:15,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:15:17,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:15:17,207 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 13:15:21,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:15:21,651 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 13:15:25,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:15:25,711 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 13:15:27,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:27,237 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:15:28,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:31,716 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 13:15:32,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=723280.0, ans=0.0 2023-09-30 13:15:33,220 INFO [train.py:1039] (1/4) Epoch 21, batch 2250, loss[loss=0.1604, simple_loss=0.2394, pruned_loss=0.04068, over 24439.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2503, pruned_loss=0.0488, over 4721942.18 frames. ], batch size: 58, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:15:33,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:15:34,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:41,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:15:41,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:15:44,872 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=723280.0, ans=0.0 2023-09-30 13:15:45,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:45,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:15:47,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:50,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 13:15:50,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:50,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:15:52,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 13:15:54,593 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:54,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:56,846 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-30 13:15:58,293 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:16:03,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:04,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:16:04,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:16:06,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 13:16:07,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:16:08,354 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=723413.3333333334, ans=0.125 2023-09-30 13:16:09,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:16:11,272 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:16:15,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:18,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:18,852 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=723413.3333333334, ans=0.0 2023-09-30 13:16:19,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:16:19,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:16:22,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:24,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:16:28,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:16:31,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:16:31,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=723480.0, ans=0.125 2023-09-30 13:16:38,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:16:38,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:16:38,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:16:43,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:16:46,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:16:46,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 13:16:47,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:47,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:16:50,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 13:16:53,624 INFO [train.py:1039] (1/4) Epoch 21, batch 2300, loss[loss=0.183, simple_loss=0.252, pruned_loss=0.05696, over 23273.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.251, pruned_loss=0.04893, over 4726719.98 frames. ], batch size: 105, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:16:53,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:16:53,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:17:03,563 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 13:17:05,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:13,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:17:13,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:17:15,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:16,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:16,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 13:17:16,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:17:18,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:19,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:17:22,894 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.842e+02 2.058e+02 2.392e+02 4.261e+02, threshold=4.115e+02, percent-clipped=2.0 2023-09-30 13:17:24,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:17:24,790 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=723746.6666666666, ans=0.2 2023-09-30 13:17:27,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:17:31,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:36,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=723746.6666666666, ans=0.125 2023-09-30 13:17:38,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:17:38,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:41,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:17:41,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=723813.3333333334, ans=0.0 2023-09-30 13:17:44,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:17:50,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:50,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:17:52,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:17:52,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 13:17:56,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:17:56,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:57,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:57,074 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:17:58,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:17:58,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:17:58,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:18:00,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 13:18:00,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:18:00,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:00,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 13:18:06,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:18:09,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:18:13,897 INFO [train.py:1039] (1/4) Epoch 21, batch 2350, loss[loss=0.1669, simple_loss=0.2527, pruned_loss=0.04052, over 24695.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2525, pruned_loss=0.04967, over 4729359.40 frames. ], batch size: 73, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:18:16,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:18:16,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:18:16,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:18:17,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:18:17,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:19,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:18:20,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 13:18:28,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:18:28,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 13:18:33,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 13:18:34,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=724013.3333333334, ans=0.2 2023-09-30 13:18:36,200 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=724013.3333333334, ans=0.125 2023-09-30 13:18:37,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:39,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:39,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:18:40,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 13:18:42,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:18:42,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=724013.3333333334, ans=0.0 2023-09-30 13:18:49,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 13:18:50,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:54,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:18:54,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:56,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:18:58,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 13:18:59,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:19:01,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:19:01,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:03,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:19:04,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:19:07,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 13:19:07,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:19:12,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:19:12,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:19:13,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 13:19:14,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:19:14,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=724146.6666666666, ans=0.125 2023-09-30 13:19:16,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=724146.6666666666, ans=0.125 2023-09-30 13:19:17,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 13:19:17,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:19:20,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 13:19:23,489 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=724213.3333333334, ans=0.0 2023-09-30 13:19:26,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 13:19:26,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:26,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:19:26,168 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 13:19:27,633 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 13:19:29,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 13:19:34,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:19:37,899 INFO [train.py:1039] (1/4) Epoch 21, batch 2400, loss[loss=0.1666, simple_loss=0.2302, pruned_loss=0.05149, over 23365.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2514, pruned_loss=0.04948, over 4719851.26 frames. ], batch size: 285, lr: 4.90e-03, grad_scale: 32.0 2023-09-30 13:19:38,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:19:41,369 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:19:44,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:19:45,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 13:19:45,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 13:19:52,868 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.67 vs. limit=15.0 2023-09-30 13:19:54,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:19:54,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:19:55,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 13:19:55,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:19:55,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:19:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 13:20:02,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:03,241 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.70 vs. limit=15.0 2023-09-30 13:20:06,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 13:20:07,654 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.817e+02 2.030e+02 2.238e+02 3.635e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 13:20:12,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:20:17,122 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 13:20:20,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:20,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:23,983 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.16 vs. limit=10.0 2023-09-30 13:20:25,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:25,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 13:20:26,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:20:31,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=724480.0, ans=0.2 2023-09-30 13:20:34,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:35,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:20:39,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:20:40,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:20:40,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:20:40,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:20:40,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:40,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:20:41,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:20:46,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=724546.6666666666, ans=0.125 2023-09-30 13:20:47,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:20:47,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:20:47,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 13:20:49,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 13:20:52,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:52,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:52,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 13:20:53,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 13:20:53,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 13:20:53,895 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 13:20:55,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 13:20:55,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:58,554 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:58,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:20:59,985 INFO [train.py:1039] (1/4) Epoch 21, batch 2450, loss[loss=0.1887, simple_loss=0.2653, pruned_loss=0.05605, over 23244.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2501, pruned_loss=0.04924, over 4716460.74 frames. ], batch size: 93, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:21:00,124 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 13:21:00,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:00,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=724613.3333333334, ans=0.125 2023-09-30 13:21:02,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:21:03,413 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=724613.3333333334, ans=0.0 2023-09-30 13:21:04,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:21:04,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=724613.3333333334, ans=0.1 2023-09-30 13:21:06,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:21:09,483 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:09,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:11,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 13:21:18,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:21:18,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:18,680 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=724680.0, ans=0.125 2023-09-30 13:21:21,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:21:21,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:21:21,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:21:22,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 13:21:27,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:29,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:21:30,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:21:33,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:21:33,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:33,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=724746.6666666666, ans=0.2 2023-09-30 13:21:36,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:36,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:38,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 13:21:40,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:21:45,394 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=724746.6666666666, ans=0.125 2023-09-30 13:21:48,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:50,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:50,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:21:52,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:21:52,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:53,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:21:55,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 13:21:55,602 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=724813.3333333334, ans=0.0 2023-09-30 13:21:58,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:58,473 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:22:02,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:02,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:09,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:22:09,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 13:22:09,452 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=724880.0, ans=0.125 2023-09-30 13:22:10,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:22:11,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:11,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 13:22:11,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:22:14,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:22:16,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:22:19,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:22:21,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:22:21,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=724946.6666666666, ans=0.125 2023-09-30 13:22:22,990 INFO [train.py:1039] (1/4) Epoch 21, batch 2500, loss[loss=0.1595, simple_loss=0.2067, pruned_loss=0.05617, over 19145.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2495, pruned_loss=0.04924, over 4723859.68 frames. ], batch size: 388, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:22:24,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 13:22:24,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:22:27,160 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=5.688e-03 2023-09-30 13:22:31,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:39,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:22:40,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:42,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:42,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 13:22:49,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:22:49,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:51,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:22:51,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:22:52,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 13:22:54,205 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 2.019e+02 2.359e+02 2.829e+02 4.327e+02, threshold=4.718e+02, percent-clipped=1.0 2023-09-30 13:22:54,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:56,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:56,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 13:22:56,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:57,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 13:22:57,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:00,371 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=725080.0, ans=0.1 2023-09-30 13:23:03,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:23:04,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:23:07,601 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:23:09,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 13:23:09,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:09,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:13,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:18,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:21,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:25,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=725146.6666666666, ans=0.1 2023-09-30 13:23:27,305 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=725213.3333333334, ans=0.2 2023-09-30 13:23:28,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:23:30,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 13:23:30,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:23:32,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:23:33,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:23:33,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:23:35,287 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 13:23:35,288 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 13:23:35,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 13:23:38,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:39,244 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=725213.3333333334, ans=0.1 2023-09-30 13:23:41,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 13:23:41,847 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 13:23:41,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:44,786 INFO [train.py:1039] (1/4) Epoch 21, batch 2550, loss[loss=0.1678, simple_loss=0.2443, pruned_loss=0.04567, over 23606.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2497, pruned_loss=0.04906, over 4720240.67 frames. ], batch size: 149, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:23:44,881 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 13:23:46,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 13:23:49,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:51,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:51,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:23:54,970 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:56,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 13:23:56,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:24:00,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 13:24:03,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:24:05,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:05,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:24:06,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 13:24:06,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:08,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:08,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:11,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:24:11,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 13:24:11,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:24:11,832 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:11,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 13:24:12,048 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=725346.6666666666, ans=0.0 2023-09-30 13:24:21,706 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.88 vs. limit=6.0 2023-09-30 13:24:24,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:24:32,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:32,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:32,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:32,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:24:39,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:41,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:41,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:24:43,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:24:43,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:24:43,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:24:46,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:48,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:51,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:24:51,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 13:24:51,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:24:51,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:52,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:24:54,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:24:55,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:03,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=725546.6666666666, ans=0.125 2023-09-30 13:25:04,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:25:07,608 INFO [train.py:1039] (1/4) Epoch 21, batch 2600, loss[loss=0.164, simple_loss=0.2415, pruned_loss=0.04323, over 24448.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2509, pruned_loss=0.04979, over 4706411.70 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:25:07,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:09,359 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 13:25:12,767 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 13:25:12,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:25:12,855 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 13:25:13,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 13:25:14,450 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 13:25:16,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:25:16,259 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 13:25:17,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 13:25:19,282 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 13:25:20,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:25:24,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 13:25:24,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=725680.0, ans=0.0 2023-09-30 13:25:25,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 13:25:26,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:25:27,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 13:25:29,533 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 13:25:29,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 13:25:31,540 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=725680.0, ans=0.125 2023-09-30 13:25:33,027 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=725680.0, ans=0.1 2023-09-30 13:25:37,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:37,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:37,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:37,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 13:25:38,533 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.20 vs. limit=15.0 2023-09-30 13:25:39,017 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.842e+02 2.050e+02 2.225e+02 3.222e+02, threshold=4.100e+02, percent-clipped=0.0 2023-09-30 13:25:42,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:25:47,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=725746.6666666666, ans=0.07 2023-09-30 13:25:48,994 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 13:25:51,869 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.29 vs. limit=8.0 2023-09-30 13:25:53,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:55,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:55,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 13:25:55,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:25:55,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:55,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=725813.3333333334, ans=0.1 2023-09-30 13:25:57,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 13:26:00,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:26:00,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:26:03,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:06,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=725813.3333333334, ans=0.1 2023-09-30 13:26:07,524 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 13:26:07,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:07,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:26:15,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:26:15,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:26:15,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 13:26:17,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:26:19,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:21,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:27,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 13:26:27,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:28,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:26:30,217 INFO [train.py:1039] (1/4) Epoch 21, batch 2650, loss[loss=0.1453, simple_loss=0.2235, pruned_loss=0.03353, over 24355.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2516, pruned_loss=0.05003, over 4709383.86 frames. ], batch size: 56, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:26:30,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=725946.6666666666, ans=0.125 2023-09-30 13:26:33,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 13:26:33,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:34,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:26:35,511 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 13:26:36,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:26:40,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:42,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:26:42,759 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=725946.6666666666, ans=0.125 2023-09-30 13:26:45,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:46,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:48,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 13:26:48,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:26:48,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:26:51,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 13:26:54,502 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 13:26:56,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:59,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 13:26:59,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:00,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 13:27:05,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:05,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:27:06,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:06,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:13,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 13:27:13,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 13:27:15,770 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=726080.0, ans=0.125 2023-09-30 13:27:16,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:27:19,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 13:27:19,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:21,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:21,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:23,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:23,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:23,957 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.52 vs. limit=15.0 2023-09-30 13:27:25,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:26,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:28,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:27:29,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:27:29,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:27:32,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:33,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:27:34,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:37,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:37,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:27:40,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:41,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:27:41,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:41,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 13:27:46,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:46,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:48,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:49,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:51,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:51,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:52,796 INFO [train.py:1039] (1/4) Epoch 21, batch 2700, loss[loss=0.1829, simple_loss=0.2644, pruned_loss=0.05075, over 24643.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2522, pruned_loss=0.05007, over 4713634.10 frames. ], batch size: 68, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:27:55,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:27:55,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 13:27:56,987 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.64 vs. limit=5.0 2023-09-30 13:27:59,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:27:59,747 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:28:01,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:28:01,439 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:28:04,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:28:04,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:04,113 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:05,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:28:05,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:05,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:28:05,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:28:05,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 13:28:07,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:28:07,857 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=726346.6666666666, ans=0.05 2023-09-30 13:28:10,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:28:10,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:28:10,844 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:28:15,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:28:16,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 13:28:16,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:28:21,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:28:21,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:28:22,643 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.21 vs. limit=22.5 2023-09-30 13:28:23,277 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.895e+02 2.064e+02 2.321e+02 3.195e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 13:28:26,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=726413.3333333334, ans=0.0 2023-09-30 13:28:27,181 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:28:27,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:28:27,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:28:27,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:28:27,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=726413.3333333334, ans=0.125 2023-09-30 13:28:31,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:34,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:34,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:28:34,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:28:39,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:39,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:28:48,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:48,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:51,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:28:51,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:28:57,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:59,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:59,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:59,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:01,044 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:29:01,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:04,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:29:05,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:05,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:10,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 13:29:10,773 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=726546.6666666666, ans=0.125 2023-09-30 13:29:12,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:12,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=726546.6666666666, ans=0.125 2023-09-30 13:29:15,405 INFO [train.py:1039] (1/4) Epoch 21, batch 2750, loss[loss=0.1641, simple_loss=0.2321, pruned_loss=0.04804, over 23804.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2512, pruned_loss=0.04947, over 4708376.76 frames. ], batch size: 212, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:29:15,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:29:15,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 13:29:15,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 13:29:15,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:19,246 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=726613.3333333334, ans=0.2 2023-09-30 13:29:20,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:21,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:23,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:23,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:29:23,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:25,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=726613.3333333334, ans=0.0 2023-09-30 13:29:28,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:29:28,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:29:28,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:29:30,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:30,421 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 13:29:30,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:29:30,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:36,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 13:29:37,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:29:38,037 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:39,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:39,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:29:41,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:42,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:29:42,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:42,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:47,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:29:47,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:29:49,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:29:51,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:51,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=726746.6666666666, ans=0.125 2023-09-30 13:29:52,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:30:00,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:03,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:30:03,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:08,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:30:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:30:08,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:30:15,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:30:15,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:30:15,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 13:30:22,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:23,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 13:30:26,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:30:29,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:30:29,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 13:30:30,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=726880.0, ans=0.0 2023-09-30 13:30:31,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:30:32,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:30:34,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 13:30:35,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:30:35,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=726880.0, ans=0.0 2023-09-30 13:30:37,938 INFO [train.py:1039] (1/4) Epoch 21, batch 2800, loss[loss=0.1647, simple_loss=0.2396, pruned_loss=0.04491, over 24598.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2503, pruned_loss=0.04894, over 4720355.30 frames. ], batch size: 60, lr: 4.89e-03, grad_scale: 32.0 2023-09-30 13:30:38,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:30:39,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:30:39,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:30:39,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=726946.6666666666, ans=0.07 2023-09-30 13:30:41,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 13:30:41,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:42,581 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:44,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:44,262 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 13:30:44,264 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 13:30:47,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:48,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=726946.6666666666, ans=0.125 2023-09-30 13:30:48,393 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=726946.6666666666, ans=0.1 2023-09-30 13:30:49,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:30:49,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:30:53,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:54,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=727013.3333333334, ans=0.125 2023-09-30 13:30:56,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 13:30:57,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:30:59,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 13:31:00,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:00,917 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:31:02,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:05,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:05,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:05,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:31:07,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:10,733 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.859e+02 2.240e+02 2.757e+02 3.972e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-30 13:31:14,263 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=727080.0, ans=0.0 2023-09-30 13:31:14,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=727080.0, ans=0.125 2023-09-30 13:31:15,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:31:17,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:31:21,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:22,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:31:24,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:28,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:28,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 13:31:28,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:29,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:29,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:31:34,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:34,988 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-09-30 13:31:35,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:37,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:40,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:31:40,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:40,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:31:41,312 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.77 vs. limit=15.0 2023-09-30 13:31:42,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:31:42,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:31:44,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:44,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 13:31:44,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:46,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:46,955 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:47,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 13:31:47,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:47,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:31:48,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:31:49,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 13:31:57,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:57,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:31:58,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:32:01,339 INFO [train.py:1039] (1/4) Epoch 21, batch 2850, loss[loss=0.1546, simple_loss=0.231, pruned_loss=0.0391, over 24266.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2485, pruned_loss=0.04846, over 4720432.71 frames. ], batch size: 61, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:32:01,451 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:06,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:06,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:06,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:32:09,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:09,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:32:10,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:32:11,346 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.25 vs. limit=22.5 2023-09-30 13:32:12,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 13:32:20,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 13:32:20,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:21,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 13:32:23,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:25,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 13:32:27,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 13:32:29,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:40,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:42,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:42,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:43,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:32:43,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:32:43,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:32:45,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:32:45,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 13:32:46,402 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.08 vs. limit=10.0 2023-09-30 13:32:47,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:32:47,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:32:48,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:48,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:52,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:52,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:53,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:55,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:56,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=727480.0, ans=0.125 2023-09-30 13:32:57,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:32:58,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:00,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:00,704 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=727480.0, ans=0.125 2023-09-30 13:33:03,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:33:08,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:33:10,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 13:33:10,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 13:33:10,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=727546.6666666666, ans=0.125 2023-09-30 13:33:11,805 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:33:13,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:13,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 13:33:13,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:33:14,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:14,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:14,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:33:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 13:33:16,414 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 13:33:16,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:16,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:18,523 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=727546.6666666666, ans=0.2 2023-09-30 13:33:21,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:21,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:22,473 INFO [train.py:1039] (1/4) Epoch 21, batch 2900, loss[loss=0.1757, simple_loss=0.2666, pruned_loss=0.04237, over 24339.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2487, pruned_loss=0.04849, over 4707275.25 frames. ], batch size: 74, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:33:22,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:33:24,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 13:33:29,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:29,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 13:33:31,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 13:33:32,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:33:32,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:33:33,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=727613.3333333334, ans=0.0 2023-09-30 13:33:34,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:36,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:33:36,722 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:33:40,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:40,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:43,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:33:43,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 13:33:44,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:33:46,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:48,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 13:33:48,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 13:33:51,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:51,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 13:33:51,185 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:33:54,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:33:54,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:55,587 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.882e+02 2.103e+02 2.412e+02 3.503e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-30 13:33:57,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:58,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:59,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=727746.6666666666, ans=0.1 2023-09-30 13:34:04,280 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.40 vs. limit=12.0 2023-09-30 13:34:04,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:34:07,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:09,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 13:34:09,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 13:34:09,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:34:13,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:34:15,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 13:34:16,700 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:34:21,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:32,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:34:32,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:34:32,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 13:34:37,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:39,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 13:34:39,143 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:39,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:34:45,008 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=727946.6666666666, ans=0.125 2023-09-30 13:34:45,991 INFO [train.py:1039] (1/4) Epoch 21, batch 2950, loss[loss=0.1603, simple_loss=0.2386, pruned_loss=0.04096, over 23615.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.25, pruned_loss=0.0494, over 4707303.40 frames. ], batch size: 149, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:34:46,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:47,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 13:34:47,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:34:47,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:49,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:34:52,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:34:53,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 13:34:54,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 13:34:55,822 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:34:55,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:35:02,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:03,205 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.05 vs. limit=15.0 2023-09-30 13:35:04,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:05,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:07,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:12,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:12,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:35:14,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:35:17,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 13:35:23,869 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 13:35:23,903 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 13:35:25,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:35:27,441 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 13:35:29,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 13:35:29,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:29,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:29,162 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 13:35:29,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:35:32,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 13:35:32,381 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=728080.0, ans=0.125 2023-09-30 13:35:33,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:33,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:35:35,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:38,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:35:38,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:38,506 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 13:35:38,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:40,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 13:35:45,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:46,285 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.12 vs. limit=15.0 2023-09-30 13:35:46,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:35:46,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 13:35:46,956 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:35:49,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 13:35:52,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:35:52,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:54,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:35:55,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:55,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:35:58,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:35:59,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:59,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:35:59,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:36:01,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:36:01,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:36:02,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:02,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 13:36:04,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:06,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:36:06,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:36:09,253 INFO [train.py:1039] (1/4) Epoch 21, batch 3000, loss[loss=0.1549, simple_loss=0.2317, pruned_loss=0.0391, over 24363.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04981, over 4712437.00 frames. ], batch size: 61, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:36:09,254 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 13:36:24,065 INFO [train.py:1071] (1/4) Epoch 21, validation: loss=0.3084, simple_loss=0.2796, pruned_loss=0.1686, over 1125622.00 frames. 2023-09-30 13:36:24,067 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 13:36:25,699 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 13:36:25,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 13:36:29,499 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=728280.0, ans=0.07 2023-09-30 13:36:30,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:36:30,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:36:30,813 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 13:36:32,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:36:34,348 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=728280.0, ans=0.125 2023-09-30 13:36:38,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:36:39,051 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=728346.6666666666, ans=0.125 2023-09-30 13:36:47,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:36:54,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 13:36:54,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:36:58,112 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.01 vs. limit=15.0 2023-09-30 13:36:58,423 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.897e+02 2.038e+02 2.302e+02 2.961e+02, threshold=4.076e+02, percent-clipped=0.0 2023-09-30 13:37:00,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:37:00,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:37:00,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:04,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:04,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 13:37:04,477 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=728413.3333333334, ans=10.0 2023-09-30 13:37:05,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 13:37:07,940 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.93 vs. limit=10.0 2023-09-30 13:37:08,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:37:08,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:37:11,816 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:37:11,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:12,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:12,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:37:17,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:37:18,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:18,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:37:19,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=728480.0, ans=0.1 2023-09-30 13:37:20,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:22,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 13:37:24,118 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:37:24,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:24,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:37:26,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:28,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:29,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:37:29,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 13:37:31,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:37:31,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 13:37:31,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:37:35,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 13:37:39,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:37:39,329 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=728546.6666666666, ans=0.2 2023-09-30 13:37:40,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:37:40,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 13:37:40,811 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 13:37:40,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:37:40,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:37:42,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:42,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:37:42,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:43,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:37:45,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 13:37:46,985 INFO [train.py:1039] (1/4) Epoch 21, batch 3050, loss[loss=0.2445, simple_loss=0.3016, pruned_loss=0.09368, over 19754.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2525, pruned_loss=0.05015, over 4716989.41 frames. ], batch size: 388, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:37:47,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:37:50,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:51,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:37:56,224 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:59,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 13:38:04,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 13:38:06,223 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 13:38:06,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:11,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:38:16,364 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:16,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:16,721 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=728680.0, ans=0.0 2023-09-30 13:38:17,369 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.61 vs. limit=12.0 2023-09-30 13:38:18,660 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=728746.6666666666, ans=10.0 2023-09-30 13:38:19,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:20,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:38:20,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:22,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:22,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:23,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:25,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:27,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:27,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 13:38:28,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:28,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:38:33,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:38:34,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:38:34,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:38:35,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:40,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:41,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:48,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:49,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:38:49,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:51,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:52,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:38:52,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:54,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 13:38:55,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:55,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:57,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 13:38:58,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:06,259 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:07,753 INFO [train.py:1039] (1/4) Epoch 21, batch 3100, loss[loss=0.1763, simple_loss=0.2633, pruned_loss=0.04468, over 24660.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2521, pruned_loss=0.05007, over 4718724.99 frames. ], batch size: 73, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:39:09,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:39:11,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:39:13,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 13:39:14,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 13:39:16,574 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=728946.6666666666, ans=0.125 2023-09-30 13:39:17,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 13:39:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:39:21,755 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:39:21,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:25,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:39:27,529 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=729013.3333333334, ans=0.125 2023-09-30 13:39:30,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:34,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 13:39:39,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:39:39,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:39,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:39:39,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:39:40,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:39:42,282 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.906e+02 2.121e+02 2.536e+02 4.254e+02, threshold=4.242e+02, percent-clipped=1.0 2023-09-30 13:39:44,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:39:44,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 13:39:44,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:39:45,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:47,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 13:39:49,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:39:52,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:39:54,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 13:39:56,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 13:39:56,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:57,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:01,295 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:02,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:02,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:40:04,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:40:04,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:40:05,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:40:05,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:05,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:05,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 13:40:12,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:40:12,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 13:40:12,829 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.28 vs. limit=15.0 2023-09-30 13:40:15,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:40:15,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 13:40:15,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:16,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:16,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 13:40:28,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 13:40:30,100 INFO [train.py:1039] (1/4) Epoch 21, batch 3150, loss[loss=0.1809, simple_loss=0.2449, pruned_loss=0.05846, over 23790.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2511, pruned_loss=0.04964, over 4713369.77 frames. ], batch size: 212, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:40:30,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:32,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:34,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:40:35,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:40:36,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 13:40:37,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:37,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:40:39,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 13:40:40,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:42,365 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 13:40:44,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 13:40:44,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:40:45,881 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 13:40:46,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:40:47,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 13:40:48,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 13:40:48,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 13:40:49,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:49,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:49,316 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=729346.6666666666, ans=0.125 2023-09-30 13:40:50,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:51,375 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 13:40:52,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:56,304 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=729346.6666666666, ans=0.2 2023-09-30 13:40:57,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:41:02,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 13:41:02,858 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:41:07,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:41:07,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:41:07,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 13:41:07,936 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.06 vs. limit=12.0 2023-09-30 13:41:10,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 13:41:11,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:41:12,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:41:12,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:41:13,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:13,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:41:15,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:41:15,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:41:16,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 13:41:18,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:41:18,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:19,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:41:19,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:41:21,217 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 13:41:22,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:24,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 13:41:24,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:26,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 13:41:26,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 13:41:28,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:41:29,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:31,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 13:41:31,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:41:32,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:35,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:41:37,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:37,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:41:42,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:41:43,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:45,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 13:41:47,645 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=729546.6666666666, ans=0.025 2023-09-30 13:41:51,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:41:51,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:41:53,362 INFO [train.py:1039] (1/4) Epoch 21, batch 3200, loss[loss=0.1712, simple_loss=0.2508, pruned_loss=0.0458, over 23699.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2498, pruned_loss=0.04922, over 4711885.34 frames. ], batch size: 106, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:41:53,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=729613.3333333334, ans=0.0 2023-09-30 13:41:56,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:58,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:41:58,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 13:42:00,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:42:06,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:42:08,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=729680.0, ans=0.125 2023-09-30 13:42:10,367 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:42:11,413 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.59 vs. limit=6.0 2023-09-30 13:42:18,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:42:23,146 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=729680.0, ans=0.0 2023-09-30 13:42:28,690 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.936e+02 2.183e+02 2.546e+02 4.680e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 13:42:28,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 13:42:29,153 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=729746.6666666666, ans=0.125 2023-09-30 13:42:30,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:42:34,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 13:42:35,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:42:40,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:42:40,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:42:41,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:42:41,924 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=729813.3333333334, ans=0.1 2023-09-30 13:42:47,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 13:42:48,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:42:50,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 13:42:54,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 13:42:55,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:43:01,954 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:43:03,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,470 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 13:43:03,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:43:06,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:08,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 13:43:10,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 13:43:10,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 13:43:13,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 13:43:13,563 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=729880.0, ans=0.0 2023-09-30 13:43:14,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:43:16,228 INFO [train.py:1039] (1/4) Epoch 21, batch 3250, loss[loss=0.1617, simple_loss=0.2422, pruned_loss=0.04054, over 24665.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2503, pruned_loss=0.04923, over 4722114.61 frames. ], batch size: 65, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:43:17,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:43:17,785 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 13:43:17,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:19,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:19,451 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 13:43:23,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:43:26,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:43:34,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:43:34,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 13:43:36,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:36,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:36,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:39,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:39,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:43:42,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:43,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:43:43,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:44,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:43:47,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:49,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:50,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:50,802 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:52,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:54,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:54,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:43:56,112 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=730080.0, ans=0.125 2023-09-30 13:43:59,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 13:44:00,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:44:00,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:44:02,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:02,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:44:08,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:44:14,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:16,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:16,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 13:44:16,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:44:16,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:44:16,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:19,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 13:44:19,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 13:44:21,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:44:21,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:22,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:22,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:44:24,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:28,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:44:28,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:29,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 13:44:29,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:33,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:44:33,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 13:44:37,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:37,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 13:44:38,548 INFO [train.py:1039] (1/4) Epoch 21, batch 3300, loss[loss=0.1731, simple_loss=0.2445, pruned_loss=0.05083, over 23502.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2512, pruned_loss=0.04984, over 4725783.05 frames. ], batch size: 134, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:44:38,902 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 13:44:40,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 13:44:41,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:46,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:47,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:44:47,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:50,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:44:50,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:44:53,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=730346.6666666666, ans=0.0 2023-09-30 13:44:54,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:56,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:44:56,654 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=730346.6666666666, ans=0.0 2023-09-30 13:45:01,012 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 13:45:01,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:01,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:01,233 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=730346.6666666666, ans=0.125 2023-09-30 13:45:03,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:03,311 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 13:45:04,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:06,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:45:07,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:45:07,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:09,826 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 13:45:13,414 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.837e+02 2.019e+02 2.374e+02 3.048e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 13:45:13,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:13,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:45:16,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:16,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 13:45:16,957 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=730413.3333333334, ans=0.5 2023-09-30 13:45:18,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 13:45:18,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:19,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:45:22,566 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 13:45:22,885 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=730413.3333333334, ans=0.0 2023-09-30 13:45:24,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 13:45:25,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:45:27,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 13:45:29,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:45:29,868 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=730480.0, ans=0.0 2023-09-30 13:45:32,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:45:34,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:45:35,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:37,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:37,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:37,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:45:40,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:45:40,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:42,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:45:43,872 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 13:45:45,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 13:45:47,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:45:47,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:45:47,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:49,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:49,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:50,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:45:51,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:52,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:45:53,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:54,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:45:56,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 13:45:57,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:58,507 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:59,869 INFO [train.py:1039] (1/4) Epoch 21, batch 3350, loss[loss=0.1761, simple_loss=0.2567, pruned_loss=0.0477, over 24452.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2512, pruned_loss=0.04943, over 4729293.56 frames. ], batch size: 66, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:46:01,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:46:01,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:46:04,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:04,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:46:04,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:09,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:46:11,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:11,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:46:13,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:15,543 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730680.0, ans=0.1 2023-09-30 13:46:16,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:46:16,920 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=730680.0, ans=0.0 2023-09-30 13:46:18,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:18,880 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.16 vs. limit=15.0 2023-09-30 13:46:20,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:46:20,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730680.0, ans=0.1 2023-09-30 13:46:21,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 13:46:23,818 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 13:46:23,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:27,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 13:46:27,105 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 13:46:28,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:46:28,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:46:31,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:31,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 13:46:31,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:33,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:46:34,746 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:36,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:37,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:37,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:46:39,766 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=730746.6666666666, ans=0.0 2023-09-30 13:46:41,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:43,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:43,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:46,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:46:48,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:51,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:51,326 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:52,233 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.81 vs. limit=15.0 2023-09-30 13:46:53,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:56,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 13:46:56,595 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:46:56,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 13:46:56,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:46:58,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 13:46:58,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:00,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:47:07,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:09,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 13:47:09,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=730880.0, ans=0.09899494936611666 2023-09-30 13:47:10,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:12,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:47:13,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:47:17,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:20,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 13:47:20,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:47:20,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:47:22,755 INFO [train.py:1039] (1/4) Epoch 21, batch 3400, loss[loss=0.2323, simple_loss=0.2977, pruned_loss=0.08347, over 19412.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2526, pruned_loss=0.05013, over 4723771.26 frames. ], batch size: 388, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:47:24,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:25,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 13:47:27,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:27,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 13:47:29,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:47:31,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:47:31,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 13:47:36,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 13:47:36,443 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 13:47:36,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:47:36,724 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=730946.6666666666, ans=0.1 2023-09-30 13:47:40,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:40,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:42,436 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:47:43,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:47:49,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:47:50,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 13:47:58,453 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 2.005e+02 2.277e+02 4.714e+02, threshold=4.010e+02, percent-clipped=1.0 2023-09-30 13:47:58,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:47:58,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:00,180 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:00,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:48:07,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:48:12,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 13:48:13,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.04 vs. limit=15.0 2023-09-30 13:48:16,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:16,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:17,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 13:48:17,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:18,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:20,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:48:20,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:48:23,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:26,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:48:26,874 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:48:30,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:34,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 13:48:40,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:48:45,280 INFO [train.py:1039] (1/4) Epoch 21, batch 3450, loss[loss=0.1793, simple_loss=0.263, pruned_loss=0.04782, over 23450.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.253, pruned_loss=0.04977, over 4725060.02 frames. ], batch size: 93, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:48:45,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 13:48:50,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 13:48:50,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:50,818 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=12.0 2023-09-30 13:48:51,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:48:51,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 13:48:53,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:55,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:49:01,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:49:01,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:03,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:49:03,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:05,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:11,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 13:49:18,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 13:49:18,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=731413.3333333334, ans=0.0 2023-09-30 13:49:20,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:49:20,118 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:49:21,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:26,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 13:49:26,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:49:32,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:49:32,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:49:33,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:49:36,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:49:39,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 13:49:39,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:49:41,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:44,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:49:44,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=731480.0, ans=0.0 2023-09-30 13:49:45,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 13:49:49,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:49:55,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:49:57,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:57,454 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=731546.6666666666, ans=0.1 2023-09-30 13:49:58,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:03,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:03,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:50:04,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:50:05,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:50:07,251 INFO [train.py:1039] (1/4) Epoch 21, batch 3500, loss[loss=0.1621, simple_loss=0.2218, pruned_loss=0.05118, over 23705.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2518, pruned_loss=0.04911, over 4738017.27 frames. ], batch size: 232, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:50:10,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:13,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:50:13,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 13:50:15,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:50:18,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 13:50:20,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=731613.3333333334, ans=0.2 2023-09-30 13:50:21,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:21,628 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 13:50:28,256 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:50:29,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:50:29,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:50:29,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:29,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:50:31,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:31,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:31,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 13:50:33,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=731680.0, ans=0.0 2023-09-30 13:50:34,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:35,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:50:37,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:41,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:42,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.60 vs. limit=15.0 2023-09-30 13:50:43,334 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.924e+02 2.163e+02 2.586e+02 4.135e+02, threshold=4.325e+02, percent-clipped=1.0 2023-09-30 13:50:43,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 13:50:43,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:46,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:47,277 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.63 vs. limit=15.0 2023-09-30 13:50:47,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:50:48,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:49,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:50:51,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:51,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 13:50:52,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 13:50:53,151 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:50:54,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 13:50:54,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:55,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:57,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:57,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:50:59,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:51:01,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:51:04,990 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.34 vs. limit=15.0 2023-09-30 13:51:07,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:07,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 13:51:07,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 13:51:07,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:08,116 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.06 vs. limit=15.0 2023-09-30 13:51:12,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:12,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:14,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:17,609 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 13:51:19,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:20,699 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:51:20,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 13:51:23,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 13:51:25,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:25,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:26,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:26,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:28,330 INFO [train.py:1039] (1/4) Epoch 21, batch 3550, loss[loss=0.1597, simple_loss=0.2309, pruned_loss=0.04427, over 24329.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2503, pruned_loss=0.04863, over 4731671.10 frames. ], batch size: 56, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:51:30,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:51:41,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:43,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:51:45,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:48,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:51:50,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:50,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:51:50,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:51:52,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=732013.3333333334, ans=0.09899494936611666 2023-09-30 13:51:54,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:55,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:51:57,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:57,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:51:58,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:52:03,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:52:03,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:52:06,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:06,861 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:52:06,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:52:07,213 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732080.0, ans=0.1 2023-09-30 13:52:08,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 13:52:08,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:09,285 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=732080.0, ans=0.125 2023-09-30 13:52:10,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:12,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:52:16,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:18,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:52:18,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:18,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=732146.6666666666, ans=0.125 2023-09-30 13:52:21,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 13:52:21,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:52:23,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 13:52:25,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:27,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:52:27,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:52:30,230 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.56 vs. limit=15.0 2023-09-30 13:52:30,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 13:52:32,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:33,916 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=732213.3333333334, ans=0.5 2023-09-30 13:52:38,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:39,913 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 13:52:39,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:44,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:46,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 13:52:51,309 INFO [train.py:1039] (1/4) Epoch 21, batch 3600, loss[loss=0.1548, simple_loss=0.2331, pruned_loss=0.03823, over 24344.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2499, pruned_loss=0.04884, over 4721628.11 frames. ], batch size: 61, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:52:51,583 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 13:52:52,248 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-09-30 13:52:52,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:52:54,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:52:56,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:53:02,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:03,541 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-09-30 13:53:03,973 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:04,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:53:05,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:53:06,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:06,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 13:53:10,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:53:11,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:16,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:19,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:21,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:53:21,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:21,569 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 13:53:23,126 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:24,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:26,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:53:27,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.948e+02 2.290e+02 2.674e+02 4.312e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-30 13:53:27,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:31,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:32,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:53:32,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 13:53:33,166 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=732413.3333333334, ans=0.0 2023-09-30 13:53:40,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:53:40,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=732480.0, ans=0.125 2023-09-30 13:53:41,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:53:43,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 13:53:48,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:53:48,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732480.0, ans=0.1 2023-09-30 13:53:54,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:56,622 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:53:59,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:05,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:54:06,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:54:06,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 13:54:06,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 13:54:08,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 13:54:10,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:54:10,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:54:11,124 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=15.0 2023-09-30 13:54:12,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 13:54:14,039 INFO [train.py:1039] (1/4) Epoch 21, batch 3650, loss[loss=0.1828, simple_loss=0.2681, pruned_loss=0.04872, over 24027.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2504, pruned_loss=0.04886, over 4726433.67 frames. ], batch size: 80, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:54:14,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:14,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:54:14,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:14,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 13:54:15,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 13:54:18,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:20,291 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 13:54:25,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 13:54:25,438 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=732613.3333333334, ans=0.1 2023-09-30 13:54:26,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:54:26,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=732613.3333333334, ans=0.2 2023-09-30 13:54:29,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 13:54:31,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 13:54:33,854 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=732680.0, ans=0.125 2023-09-30 13:54:36,476 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:54:36,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:54:36,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:54:38,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:54:38,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:40,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 13:54:42,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:54:42,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:43,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 13:54:45,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:54:45,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:54:45,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:54:48,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:54:49,985 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.23 vs. limit=15.0 2023-09-30 13:54:50,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 13:54:52,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 13:54:52,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:54:53,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 13:54:54,450 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.30 vs. limit=12.0 2023-09-30 13:54:55,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:54:55,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:55:01,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:55:01,819 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=732813.3333333334, ans=0.0 2023-09-30 13:55:03,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:03,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:55:06,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:55:06,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:55:09,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:55:12,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:13,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:13,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:55:16,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:55:16,576 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:16,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:23,805 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 13:55:26,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:26,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:27,054 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:55:28,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:29,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:55:30,292 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=732880.0, ans=0.1 2023-09-30 13:55:31,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:34,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 13:55:34,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:35,778 INFO [train.py:1039] (1/4) Epoch 21, batch 3700, loss[loss=0.1545, simple_loss=0.2369, pruned_loss=0.03602, over 24670.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2514, pruned_loss=0.04979, over 4718379.56 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:55:37,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:55:41,145 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:41,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:55:44,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:44,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 13:55:44,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:45,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:55:45,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:55:49,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:55:54,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:55,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:55:56,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:55:56,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:57,649 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:56:00,013 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.66 vs. limit=15.0 2023-09-30 13:56:00,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:00,901 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 13:56:08,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:56:08,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:56:10,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:56:10,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 13:56:10,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:10,821 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=733080.0, ans=0.1 2023-09-30 13:56:11,740 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.839e+02 1.995e+02 2.258e+02 3.801e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 13:56:15,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:16,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 13:56:18,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:18,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:56:22,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:22,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:56:25,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:56:26,148 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.75 vs. limit=15.0 2023-09-30 13:56:29,782 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:29,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 13:56:29,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:29,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 13:56:36,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:56:37,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:56:38,464 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.31 vs. limit=15.0 2023-09-30 13:56:39,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:39,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 13:56:42,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:56:42,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:56:43,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:43,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:47,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:47,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 13:56:48,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 13:56:50,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:56:50,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:56:51,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:56:52,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:56:57,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:59,059 INFO [train.py:1039] (1/4) Epoch 21, batch 3750, loss[loss=0.1801, simple_loss=0.2566, pruned_loss=0.05185, over 23426.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2525, pruned_loss=0.05047, over 4712170.81 frames. ], batch size: 106, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:56:59,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:56:59,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:02,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 13:57:04,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 13:57:07,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:57:07,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 13:57:09,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:57:09,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:10,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:12,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:15,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:15,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=733346.6666666666, ans=0.1 2023-09-30 13:57:15,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=733346.6666666666, ans=0.09899494936611666 2023-09-30 13:57:18,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:57:18,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:57:21,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:57:24,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:24,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 13:57:25,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:26,606 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=733346.6666666666, ans=12.0 2023-09-30 13:57:27,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:27,196 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:32,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 13:57:36,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 13:57:36,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=733413.3333333334, ans=0.125 2023-09-30 13:57:37,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:39,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:40,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:44,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:46,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:57:52,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 13:57:55,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:59,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:59,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:58:04,317 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:58:08,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:58:09,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:58:11,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:58:13,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:58:17,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:58:20,406 INFO [train.py:1039] (1/4) Epoch 21, batch 3800, loss[loss=0.1641, simple_loss=0.2314, pruned_loss=0.04838, over 23698.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2525, pruned_loss=0.05048, over 4708857.71 frames. ], batch size: 232, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:58:25,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:58:28,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:29,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:58:31,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 13:58:32,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:36,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:36,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:58:37,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=733680.0, ans=0.2 2023-09-30 13:58:40,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:58:40,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:40,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:58:42,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:43,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:58:43,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:43,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 13:58:47,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:58:48,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:58:52,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=733746.6666666666, ans=0.125 2023-09-30 13:58:53,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:55,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:58:55,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=733746.6666666666, ans=0.125 2023-09-30 13:58:56,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:58:58,111 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.186e+02 2.612e+02 3.955e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 13:58:58,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:58:58,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:59,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:01,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:59:06,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:59:06,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 13:59:08,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:10,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=733813.3333333334, ans=0.125 2023-09-30 13:59:15,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:59:20,230 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=733813.3333333334, ans=15.0 2023-09-30 13:59:21,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:59:21,429 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=733813.3333333334, ans=0.125 2023-09-30 13:59:22,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 13:59:24,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 13:59:25,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:59:28,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:30,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:31,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 13:59:31,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=733880.0, ans=0.2 2023-09-30 13:59:34,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 13:59:34,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 13:59:34,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:36,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:59:41,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:59:41,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:59:43,081 INFO [train.py:1039] (1/4) Epoch 21, batch 3850, loss[loss=0.1713, simple_loss=0.257, pruned_loss=0.04279, over 24286.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.251, pruned_loss=0.05006, over 4710379.08 frames. ], batch size: 74, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:59:47,257 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=733946.6666666666, ans=0.1 2023-09-30 13:59:48,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:59:50,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 13:59:50,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:59:50,460 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=733946.6666666666, ans=0.1 2023-09-30 13:59:52,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:55,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:59:59,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:01,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:00:01,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 14:00:02,460 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.83 vs. limit=12.0 2023-09-30 14:00:03,738 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=15.0 2023-09-30 14:00:09,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:10,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:00:15,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:15,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:00:17,912 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.29 vs. limit=15.0 2023-09-30 14:00:18,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:19,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:00:21,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:21,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:00:23,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:24,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:26,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:26,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:00:26,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 14:00:26,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 14:00:27,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:27,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:31,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 14:00:32,789 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.14 vs. limit=22.5 2023-09-30 14:00:34,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 14:00:36,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:39,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 14:00:40,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 14:00:45,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=734146.6666666666, ans=0.0 2023-09-30 14:00:46,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:47,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:50,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:52,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 14:00:54,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 14:00:59,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:59,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:59,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=734213.3333333334, ans=0.125 2023-09-30 14:01:01,788 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.58 vs. limit=15.0 2023-09-30 14:01:03,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:01:03,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:01:03,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,728 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:01:04,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 14:01:04,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:01:05,487 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.39 vs. limit=15.0 2023-09-30 14:01:06,279 INFO [train.py:1039] (1/4) Epoch 21, batch 3900, loss[loss=0.1648, simple_loss=0.2352, pruned_loss=0.04723, over 23461.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2497, pruned_loss=0.04949, over 4713948.28 frames. ], batch size: 134, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:01:07,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 14:01:07,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:07,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:09,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:01:09,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:11,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:01:11,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:11,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:01:12,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:12,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 14:01:12,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:14,424 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:15,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:16,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:01:17,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:22,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:22,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:23,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:01:24,139 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=734346.6666666666, ans=0.125 2023-09-30 14:01:26,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 14:01:27,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:28,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 14:01:28,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:31,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 14:01:31,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 14:01:38,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:39,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:39,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:01:41,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:01:41,423 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=734413.3333333334, ans=0.125 2023-09-30 14:01:44,190 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.861e+02 2.057e+02 2.419e+02 3.679e+02, threshold=4.115e+02, percent-clipped=0.0 2023-09-30 14:01:44,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:47,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:01:48,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:01:49,000 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:01:49,894 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.06 vs. limit=15.0 2023-09-30 14:01:50,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:01:57,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:57,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:02:06,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:02:06,659 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:02:13,820 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=734546.6666666666, ans=0.0 2023-09-30 14:02:17,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:19,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:21,228 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 14:02:21,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 14:02:22,693 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:22,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 14:02:24,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:02:25,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 14:02:28,995 INFO [train.py:1039] (1/4) Epoch 21, batch 3950, loss[loss=0.1864, simple_loss=0.2697, pruned_loss=0.05159, over 24361.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2491, pruned_loss=0.04925, over 4705288.97 frames. ], batch size: 77, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:02:32,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:02:34,275 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 14:02:34,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:02:38,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:02:41,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:02:45,392 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 14:02:46,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:46,867 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 14:02:48,821 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 14:02:48,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:50,742 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=734680.0, ans=0.125 2023-09-30 14:02:51,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:51,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:02:51,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:55,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 14:02:56,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:02:58,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:58,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:02:58,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:02:59,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:03:11,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:03:11,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:03:11,614 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=734746.6666666666, ans=0.035 2023-09-30 14:03:15,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734746.6666666666, ans=0.1 2023-09-30 14:03:16,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 14:03:24,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.74 vs. limit=12.0 2023-09-30 14:03:24,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 14:03:24,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 14:03:24,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:03:27,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:03:28,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=734813.3333333334, ans=0.2 2023-09-30 14:03:35,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:03:36,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:03:37,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:03:37,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:03:37,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 14:03:38,345 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=15.0 2023-09-30 14:03:39,479 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=734880.0, ans=0.125 2023-09-30 14:03:40,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:03:42,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:03:47,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 14:03:52,513 INFO [train.py:1039] (1/4) Epoch 21, batch 4000, loss[loss=0.1737, simple_loss=0.247, pruned_loss=0.05015, over 23393.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2498, pruned_loss=0.04969, over 4710050.74 frames. ], batch size: 119, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 14:03:59,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:03:59,638 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=734946.6666666666, ans=0.125 2023-09-30 14:04:05,446 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:12,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:12,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:04:13,531 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:13,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 14:04:15,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:04:15,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 14:04:15,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:04:15,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 14:04:17,437 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.11 vs. limit=15.0 2023-09-30 14:04:18,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:21,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:04:21,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:04:21,394 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:04:21,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:21,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:04:24,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:04:26,264 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 14:04:28,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:04:28,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:29,849 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.867e+02 2.046e+02 2.259e+02 3.289e+02, threshold=4.093e+02, percent-clipped=0.0 2023-09-30 14:04:30,484 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=735080.0, ans=0.1 2023-09-30 14:04:31,606 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 14:04:33,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:04:33,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:33,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=735080.0, ans=0.2 2023-09-30 14:04:35,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=735080.0, ans=0.0 2023-09-30 14:04:38,534 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 14:04:40,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:40,517 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=735146.6666666666, ans=0.09899494936611666 2023-09-30 14:04:41,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:04:42,329 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.81 vs. limit=15.0 2023-09-30 14:04:43,160 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 14:04:44,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:04:44,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 14:04:44,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:04:46,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:48,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:04:49,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:04:49,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:04:49,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:50,224 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=735146.6666666666, ans=0.125 2023-09-30 14:04:52,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 14:04:52,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:53,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=735146.6666666666, ans=0.09899494936611666 2023-09-30 14:04:55,977 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 14:04:59,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:05:03,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 14:05:06,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:05:07,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:08,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:05:09,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:14,491 INFO [train.py:1039] (1/4) Epoch 21, batch 4050, loss[loss=0.1652, simple_loss=0.2411, pruned_loss=0.04463, over 23449.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2516, pruned_loss=0.04988, over 4702573.78 frames. ], batch size: 119, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:05:16,135 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:17,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:05:19,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 14:05:20,769 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:05:22,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:05:22,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:05:24,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:24,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=735280.0, ans=0.1 2023-09-30 14:05:26,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:29,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:32,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:05:33,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 14:05:35,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:05:35,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:05:40,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:43,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:46,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 14:05:47,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 14:05:47,982 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 14:05:49,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:05:57,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 14:05:58,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:05:58,352 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=735413.3333333334, ans=0.1 2023-09-30 14:06:00,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:03,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:06:05,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:06:05,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:05,686 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=735480.0, ans=0.0 2023-09-30 14:06:08,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:06:14,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 14:06:14,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:06:16,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:17,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 14:06:21,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:28,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 14:06:30,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:30,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:06:32,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 14:06:32,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 14:06:32,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:35,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:06:36,985 INFO [train.py:1039] (1/4) Epoch 21, batch 4100, loss[loss=0.1452, simple_loss=0.2227, pruned_loss=0.03383, over 24322.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2522, pruned_loss=0.04967, over 4717113.66 frames. ], batch size: 56, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:06:37,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:37,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:06:44,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 14:06:44,757 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=735613.3333333334, ans=0.125 2023-09-30 14:06:47,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 14:06:48,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 14:06:50,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 14:06:50,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:51,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:51,511 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:52,978 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:06:53,131 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 14:06:57,605 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:06:57,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:06:57,766 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:57,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:07:01,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:07:02,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:07:04,285 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:07:04,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 14:07:05,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:05,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:07:06,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:06,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:07:07,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 14:07:09,619 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:11,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 14:07:12,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:07:14,618 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:07:16,300 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:16,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 14:07:18,229 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.830e+02 1.986e+02 2.444e+02 3.912e+02, threshold=3.973e+02, percent-clipped=0.0 2023-09-30 14:07:19,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:07:19,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:07:21,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:07:22,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 14:07:24,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:07:26,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:07:29,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 14:07:29,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:29,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:07:34,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:40,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:07:42,672 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.22 vs. limit=15.0 2023-09-30 14:07:44,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:45,458 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:07:52,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:07:52,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:56,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:57,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:08:00,536 INFO [train.py:1039] (1/4) Epoch 21, batch 4150, loss[loss=0.1623, simple_loss=0.2383, pruned_loss=0.04312, over 24326.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2521, pruned_loss=0.04987, over 4716848.87 frames. ], batch size: 56, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:08:02,852 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:08:04,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:08:05,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:08:05,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:08,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 14:08:08,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:10,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 14:08:10,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 14:08:11,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 14:08:12,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:17,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:08:17,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:17,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=736013.3333333334, ans=0.5 2023-09-30 14:08:21,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:22,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:08:23,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:08:25,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:08:25,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:27,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:08:31,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:35,025 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:38,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 14:08:38,791 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=736080.0, ans=0.125 2023-09-30 14:08:40,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 14:08:40,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:08:42,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 14:08:42,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:08:42,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:08:44,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:08:46,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:51,320 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 14:08:51,749 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=736146.6666666666, ans=0.2 2023-09-30 14:08:54,480 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:08:55,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:08:57,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 14:08:57,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:57,730 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=736146.6666666666, ans=0.05 2023-09-30 14:08:59,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 14:09:02,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:09:02,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:09:04,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:05,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 14:09:05,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:05,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:09:08,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:09:10,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 14:09:10,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:10,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:09:10,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:09:12,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 14:09:12,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:09:12,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:09:14,056 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:09:14,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:14,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 14:09:14,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=736213.3333333334, ans=0.1 2023-09-30 14:09:15,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:09:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:09:22,199 INFO [train.py:1039] (1/4) Epoch 21, batch 4200, loss[loss=0.1837, simple_loss=0.2518, pruned_loss=0.05778, over 23812.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2504, pruned_loss=0.04958, over 4717977.52 frames. ], batch size: 195, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:09:22,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 14:09:24,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:09:26,554 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.48 vs. limit=10.0 2023-09-30 14:09:26,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:28,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:09:28,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:28,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:29,461 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-09-30 14:09:32,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 14:09:35,059 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.29 vs. limit=15.0 2023-09-30 14:09:35,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 14:09:37,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:38,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:41,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:09:46,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:09:46,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:09:46,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:48,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 14:09:48,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:50,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:50,417 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=736346.6666666666, ans=0.125 2023-09-30 14:09:51,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:51,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:09:53,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:09:54,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 14:09:54,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:57,322 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=736413.3333333334, ans=0.1 2023-09-30 14:10:00,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:10:00,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:10:01,429 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.766e+02 1.995e+02 2.283e+02 3.415e+02, threshold=3.990e+02, percent-clipped=0.0 2023-09-30 14:10:01,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:10:03,421 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:10:04,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:10:06,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:10:07,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 14:10:07,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:09,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:10:12,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:10:14,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:19,808 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.11 vs. limit=22.5 2023-09-30 14:10:22,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:10:23,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 14:10:27,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:33,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:10:33,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:34,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 14:10:41,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:10:45,458 INFO [train.py:1039] (1/4) Epoch 21, batch 4250, loss[loss=0.1949, simple_loss=0.272, pruned_loss=0.05888, over 24010.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2498, pruned_loss=0.04917, over 4711789.73 frames. ], batch size: 80, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:10:47,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:47,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:10:51,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:56,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:10:56,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 14:10:56,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:11:01,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:01,498 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=736680.0, ans=0.1 2023-09-30 14:11:01,524 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:11:06,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:09,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:09,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:10,286 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-09-30 14:11:12,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:11:12,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:13,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:16,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:17,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:19,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:11:19,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:21,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 14:11:25,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 14:11:25,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:25,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:26,760 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:26,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:11:26,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:26,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:31,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:11:32,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:11:36,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=736813.3333333334, ans=0.125 2023-09-30 14:11:37,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:11:39,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:41,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 14:11:41,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:11:41,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 14:11:42,639 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:11:44,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:11:45,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:45,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:48,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 14:11:51,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:11:52,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:11:57,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:59,532 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=736880.0, ans=0.125 2023-09-30 14:12:00,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:01,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=736880.0, ans=0.0 2023-09-30 14:12:02,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:12:02,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:03,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:05,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:12:06,795 INFO [train.py:1039] (1/4) Epoch 21, batch 4300, loss[loss=0.1748, simple_loss=0.2642, pruned_loss=0.04269, over 24302.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2488, pruned_loss=0.04898, over 4720132.61 frames. ], batch size: 74, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:12:06,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:06,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 14:12:09,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:13,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:15,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:19,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:29,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:29,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 14:12:29,866 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:12:33,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:12:33,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:12:33,505 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 14:12:36,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:12:38,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:12:41,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 14:12:41,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:12:42,579 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 14:12:44,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:12:45,602 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.887e+02 2.170e+02 2.542e+02 3.657e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 14:12:45,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:12:46,057 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=737080.0, ans=0.05 2023-09-30 14:12:49,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:12:49,439 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:51,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:12:52,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:12:52,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:54,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 14:12:54,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 14:12:57,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:59,848 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.33 vs. limit=15.0 2023-09-30 14:13:01,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:13:01,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:13:01,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 14:13:02,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 14:13:02,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 14:13:04,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:04,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 14:13:06,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 14:13:09,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:11,230 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 14:13:11,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:13:14,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:14,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:15,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 14:13:17,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:13:17,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:17,869 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=737213.3333333334, ans=0.125 2023-09-30 14:13:19,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:13:19,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:21,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:13:22,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:13:24,865 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.06 vs. limit=15.0 2023-09-30 14:13:25,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:25,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:25,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:27,977 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.39 vs. limit=6.0 2023-09-30 14:13:28,544 INFO [train.py:1039] (1/4) Epoch 21, batch 4350, loss[loss=0.1568, simple_loss=0.238, pruned_loss=0.03777, over 24470.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2495, pruned_loss=0.04914, over 4702886.68 frames. ], batch size: 63, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:13:30,539 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:13:31,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 14:13:31,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:13:33,675 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=737280.0, ans=0.0 2023-09-30 14:13:37,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:41,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:44,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:13:44,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:13:49,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:13:53,469 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=737346.6666666666, ans=0.125 2023-09-30 14:13:54,459 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:56,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:13:57,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:14:00,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:14:02,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:14:02,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:14:07,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 14:14:08,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:08,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:08,782 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=737413.3333333334, ans=0.0 2023-09-30 14:14:13,484 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.42 vs. limit=15.0 2023-09-30 14:14:15,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:18,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 14:14:22,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:22,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=737480.0, ans=0.0 2023-09-30 14:14:23,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:14:28,847 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 14:14:31,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:31,820 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:14:33,247 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 14:14:33,367 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 14:14:33,378 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:34,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:35,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:14:36,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:38,061 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:38,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:14:41,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 14:14:41,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:41,281 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:41,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:42,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 14:14:44,314 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 14:14:44,321 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 14:14:44,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 14:14:46,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:14:47,019 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=737546.6666666666, ans=0.0 2023-09-30 14:14:48,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:14:48,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:14:48,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:14:51,252 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=737613.3333333334, ans=0.125 2023-09-30 14:14:52,188 INFO [train.py:1039] (1/4) Epoch 21, batch 4400, loss[loss=0.1799, simple_loss=0.2506, pruned_loss=0.05463, over 23798.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2498, pruned_loss=0.04878, over 4718822.75 frames. ], batch size: 195, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:14:52,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 14:14:53,734 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 14:14:53,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:58,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:14:58,521 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:58,951 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=737613.3333333334, ans=0.125 2023-09-30 14:15:00,093 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:15:01,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 14:15:01,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 14:15:03,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 14:15:03,301 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 14:15:03,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:15:05,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:15:07,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 14:15:08,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:10,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:10,226 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 14:15:14,756 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:14,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 14:15:14,841 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 14:15:18,435 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 14:15:19,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 14:15:19,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 14:15:19,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:20,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:23,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 14:15:23,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 14:15:25,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:27,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:15:27,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:27,592 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=737746.6666666666, ans=0.1 2023-09-30 14:15:30,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:30,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:30,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 14:15:31,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.966e+02 2.237e+02 2.534e+02 3.532e+02, threshold=4.474e+02, percent-clipped=0.0 2023-09-30 14:15:31,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 14:15:34,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:43,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:44,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 14:15:49,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:15:51,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:15:56,253 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:15:56,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 14:15:58,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:15:58,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:15:58,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:15:58,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:15:59,317 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=22.5 2023-09-30 14:16:02,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 14:16:02,933 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=737880.0, ans=0.0 2023-09-30 14:16:05,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 14:16:08,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 14:16:08,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:08,445 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 14:16:08,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:16:10,401 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=737880.0, ans=0.1 2023-09-30 14:16:11,765 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:16:13,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 14:16:15,284 INFO [train.py:1039] (1/4) Epoch 21, batch 4450, loss[loss=0.1666, simple_loss=0.2451, pruned_loss=0.04403, over 23247.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2511, pruned_loss=0.04943, over 4712111.59 frames. ], batch size: 119, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:16:15,833 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=737946.6666666666, ans=0.0 2023-09-30 14:16:17,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:16:18,943 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=737946.6666666666, ans=0.125 2023-09-30 14:16:20,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:20,213 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:16:24,197 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.84 vs. limit=15.0 2023-09-30 14:16:25,160 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:16:25,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:16:30,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:30,938 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.46 vs. limit=6.0 2023-09-30 14:16:33,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:16:34,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:16:34,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:38,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 14:16:38,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:39,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:40,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:16:40,017 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:16:41,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:16:45,581 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=738013.3333333334, ans=0.125 2023-09-30 14:16:48,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:48,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:49,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:51,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:53,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:16:57,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:16:58,925 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 14:17:00,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 14:17:00,323 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:17:02,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:03,131 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.38 vs. limit=15.0 2023-09-30 14:17:03,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 14:17:07,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:17:10,788 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:12,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 14:17:14,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:14,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:14,297 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:17:14,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:15,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:20,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:17:21,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 14:17:22,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:17:23,290 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.77 vs. limit=12.0 2023-09-30 14:17:24,309 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:17:25,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:27,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:27,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:17:27,616 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=738213.3333333334, ans=0.125 2023-09-30 14:17:30,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:17:32,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 14:17:33,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:17:37,152 INFO [train.py:1039] (1/4) Epoch 21, batch 4500, loss[loss=0.1599, simple_loss=0.2385, pruned_loss=0.04068, over 24464.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2509, pruned_loss=0.04997, over 4713490.28 frames. ], batch size: 58, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:17:39,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:41,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 14:17:41,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 14:17:43,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:17:44,362 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=738280.0, ans=0.0 2023-09-30 14:17:48,950 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:50,422 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:50,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738280.0, ans=0.1 2023-09-30 14:17:52,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:17:52,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:17:53,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:17:54,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:18:05,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:18:06,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:18:09,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:10,953 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:18:11,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:18:16,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:18:18,462 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.886e+02 2.149e+02 2.495e+02 4.486e+02, threshold=4.299e+02, percent-clipped=1.0 2023-09-30 14:18:19,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=738413.3333333334, ans=0.07 2023-09-30 14:18:20,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:18:25,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:18:28,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:18:30,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 14:18:31,979 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:32,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:36,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:18:36,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 14:18:36,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:18:36,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:41,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:18:41,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:18:45,131 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:47,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:18:47,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:18:48,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 14:18:51,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 14:18:52,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 14:18:55,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 14:18:59,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 14:19:00,004 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=738613.3333333334, ans=0.0 2023-09-30 14:19:00,951 INFO [train.py:1039] (1/4) Epoch 21, batch 4550, loss[loss=0.154, simple_loss=0.2284, pruned_loss=0.03977, over 24318.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.25, pruned_loss=0.04965, over 4697829.31 frames. ], batch size: 56, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:19:01,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:01,433 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=738613.3333333334, ans=0.09899494936611666 2023-09-30 14:19:05,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:05,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:08,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:13,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:19:15,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:19:16,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:16,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:19:16,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:17,186 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=738680.0, ans=0.0 2023-09-30 14:19:21,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:21,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:25,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:19:28,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 14:19:28,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 14:19:29,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:19:31,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 14:19:32,313 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738746.6666666666, ans=0.1 2023-09-30 14:19:37,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 14:19:37,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:38,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 14:19:40,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:19:43,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,575 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:19:43,931 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=738746.6666666666, ans=0.2 2023-09-30 14:19:46,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 14:19:48,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:19:49,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:51,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:53,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:56,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 14:19:56,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 14:19:57,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:19:58,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 14:20:00,072 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 14:20:01,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:20:01,606 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:01,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:20:03,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:03,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:20:05,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:20:06,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 14:20:08,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:20:08,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:20:08,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 14:20:08,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:20:09,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 14:20:13,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:20:13,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:20:16,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:20:17,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:17,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:20:17,829 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=738880.0, ans=0.1 2023-09-30 14:20:19,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:20:20,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:20:21,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=738946.6666666666, ans=0.125 2023-09-30 14:20:22,183 INFO [train.py:1039] (1/4) Epoch 21, batch 4600, loss[loss=0.1736, simple_loss=0.2489, pruned_loss=0.04912, over 22466.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2489, pruned_loss=0.0491, over 4700169.26 frames. ], batch size: 49, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:20:23,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:23,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:20:26,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=738946.6666666666, ans=0.2 2023-09-30 14:20:27,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:20:27,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:20:29,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:31,173 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 14:20:34,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:20:37,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:20:37,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:41,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:47,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 14:20:48,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:48,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=739013.3333333334, ans=0.1 2023-09-30 14:20:50,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:52,273 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:20:55,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:20:55,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:58,604 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=739080.0, ans=0.0 2023-09-30 14:21:00,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 14:21:00,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:21:02,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:03,435 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.897e+02 2.077e+02 2.452e+02 3.334e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 14:21:08,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:08,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:21:10,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:21:17,106 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 14:21:18,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:21:19,084 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=739146.6666666666, ans=0.125 2023-09-30 14:21:20,470 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=739146.6666666666, ans=0.125 2023-09-30 14:21:21,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:23,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:21:26,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:26,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 14:21:26,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:28,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 14:21:28,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:28,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:28,582 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=739213.3333333334, ans=0.0 2023-09-30 14:21:29,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:31,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:31,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:32,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 14:21:32,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 14:21:33,533 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=739213.3333333334, ans=10.0 2023-09-30 14:21:34,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 14:21:34,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:34,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:36,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:37,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:44,501 INFO [train.py:1039] (1/4) Epoch 21, batch 4650, loss[loss=0.1721, simple_loss=0.2575, pruned_loss=0.04337, over 24616.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.249, pruned_loss=0.0491, over 4712647.03 frames. ], batch size: 68, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:21:48,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:21:51,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:51,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:52,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:21:52,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:52,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:52,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:57,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 14:22:00,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:22:02,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 14:22:02,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:22:02,704 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.77 vs. limit=6.0 2023-09-30 14:22:03,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 14:22:03,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:22:05,086 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 14:22:05,123 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 14:22:05,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:05,242 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:22:10,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:22:11,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:12,001 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 14:22:14,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:15,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 14:22:18,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:18,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:22:18,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 14:22:21,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=739413.3333333334, ans=0.125 2023-09-30 14:22:22,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:22:25,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:22:28,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:35,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:38,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:39,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:41,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:22:41,398 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=739480.0, ans=0.0 2023-09-30 14:22:44,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 14:22:44,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 14:22:46,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 14:22:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 14:22:47,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:22:54,745 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:22:54,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:22:56,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 14:22:56,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:58,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:22:58,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:22:59,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:23:01,308 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.70 vs. limit=15.0 2023-09-30 14:23:03,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:23:03,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:23:05,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:23:06,511 INFO [train.py:1039] (1/4) Epoch 21, batch 4700, loss[loss=0.1441, simple_loss=0.2278, pruned_loss=0.03014, over 24601.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2492, pruned_loss=0.04854, over 4714689.49 frames. ], batch size: 60, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:23:08,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:09,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:23:09,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:23:09,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 14:23:11,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:23:12,879 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 14:23:19,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:21,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:21,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:23:22,686 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:24,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:23:27,911 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=739680.0, ans=0.1 2023-09-30 14:23:30,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 14:23:30,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 14:23:34,998 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:36,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:23:36,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:23:39,761 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:42,170 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.64 vs. limit=15.0 2023-09-30 14:23:46,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:23:46,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:23:47,468 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.021e+02 2.237e+02 3.621e+02, threshold=4.043e+02, percent-clipped=0.0 2023-09-30 14:23:49,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:54,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=739813.3333333334, ans=0.1 2023-09-30 14:23:56,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 14:23:56,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:23:59,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:04,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 14:24:05,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:24:11,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:24:12,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 14:24:14,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:14,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:15,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:24:17,303 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:24:17,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 14:24:18,879 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 14:24:20,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:21,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 14:24:23,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:28,517 INFO [train.py:1039] (1/4) Epoch 21, batch 4750, loss[loss=0.1832, simple_loss=0.2473, pruned_loss=0.05955, over 23413.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2503, pruned_loss=0.04895, over 4697192.07 frames. ], batch size: 285, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:24:28,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 14:24:30,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:24:31,092 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-09-30 14:24:31,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:24:37,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 14:24:37,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:24:42,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 14:24:45,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:24:45,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:47,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:24:52,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 14:24:53,894 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=740013.3333333334, ans=0.125 2023-09-30 14:24:56,192 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.43 vs. limit=10.0 2023-09-30 14:24:57,036 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=740013.3333333334, ans=0.2 2023-09-30 14:24:58,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:24:58,562 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:24:59,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 14:24:59,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:02,353 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=740080.0, ans=0.0 2023-09-30 14:25:04,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:04,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:05,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:06,524 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 14:25:06,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 14:25:12,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 14:25:14,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:18,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:20,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:25:20,660 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 14:25:20,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:23,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:25:25,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:25:25,625 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=740146.6666666666, ans=0.125 2023-09-30 14:25:26,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 14:25:26,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 14:25:28,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:28,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:25:29,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:29,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:25:29,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 14:25:31,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=740146.6666666666, ans=0.0 2023-09-30 14:25:33,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 14:25:36,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:25:38,616 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:25:39,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 14:25:40,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:40,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:41,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:25:43,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:44,726 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:25:46,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:46,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 14:25:48,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 14:25:49,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 14:25:52,012 INFO [train.py:1039] (1/4) Epoch 21, batch 4800, loss[loss=0.1538, simple_loss=0.2361, pruned_loss=0.03572, over 24270.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2512, pruned_loss=0.04923, over 4704550.61 frames. ], batch size: 56, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:25:53,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:25:53,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:55,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 14:26:00,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:00,312 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=740280.0, ans=0.125 2023-09-30 14:26:01,515 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:06,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:26:07,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:08,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:08,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 14:26:10,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=740346.6666666666, ans=0.2 2023-09-30 14:26:11,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:26:11,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:26:11,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:26:15,062 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:15,490 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:26:16,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:16,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:26:18,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:18,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:26:18,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:20,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:24,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:27,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:26:30,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:26:31,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:32,876 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.899e+02 2.131e+02 2.497e+02 3.417e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-30 14:26:34,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 14:26:34,536 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 14:26:35,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:35,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:26:36,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:26:36,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:36,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:26:37,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:26:39,224 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:43,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:26:47,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:48,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:26:55,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 14:26:55,550 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:55,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:55,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:26:57,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:27:00,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:27:02,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:27:02,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:02,493 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:27:03,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:27:05,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:27:05,657 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=740546.6666666666, ans=0.125 2023-09-30 14:27:08,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:08,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:08,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:27:10,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 14:27:12,840 INFO [train.py:1039] (1/4) Epoch 21, batch 4850, loss[loss=0.1716, simple_loss=0.2576, pruned_loss=0.04275, over 24629.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2514, pruned_loss=0.04952, over 4711217.45 frames. ], batch size: 68, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:27:12,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 14:27:13,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:13,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:13,134 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:13,136 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:14,811 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=740613.3333333334, ans=0.125 2023-09-30 14:27:16,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:27:21,627 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=740613.3333333334, ans=0.07 2023-09-30 14:27:24,995 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.34 vs. limit=22.5 2023-09-30 14:27:26,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 14:27:27,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:27,947 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=740680.0, ans=0.0 2023-09-30 14:27:32,945 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:33,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:27:34,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:38,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:39,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:27:41,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:27:41,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 14:27:44,918 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=740746.6666666666, ans=10.0 2023-09-30 14:27:46,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:47,741 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:27:47,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:27:49,227 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:27:49,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 14:27:51,026 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=740746.6666666666, ans=0.125 2023-09-30 14:27:52,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:52,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 14:27:56,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 14:27:57,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:28:06,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:28:07,533 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 14:28:08,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:28:08,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:28:08,694 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:28:09,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:28:12,191 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=740813.3333333334, ans=0.0 2023-09-30 14:28:13,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 14:28:13,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:14,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 14:28:14,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:16,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:16,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 14:28:20,323 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.13 vs. limit=15.0 2023-09-30 14:28:27,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:32,161 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:28:32,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:35,177 INFO [train.py:1039] (1/4) Epoch 21, batch 4900, loss[loss=0.1801, simple_loss=0.2491, pruned_loss=0.05557, over 23667.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2506, pruned_loss=0.04941, over 4705509.04 frames. ], batch size: 149, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:28:37,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 14:28:37,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:28:41,337 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=740946.6666666666, ans=0.125 2023-09-30 14:28:42,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:44,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:44,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:28:47,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 14:28:52,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 14:28:56,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 14:28:57,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 14:28:57,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:28:57,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:57,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:28:57,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:57,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:28:59,373 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 14:28:59,535 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=741013.3333333334, ans=0.125 2023-09-30 14:29:03,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 14:29:05,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:29:05,924 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.53 vs. limit=15.0 2023-09-30 14:29:06,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:29:08,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:29:10,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:29:11,253 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.12 vs. limit=22.5 2023-09-30 14:29:12,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:12,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:13,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 14:29:13,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=741080.0, ans=0.125 2023-09-30 14:29:15,216 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.04 vs. limit=15.0 2023-09-30 14:29:15,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:29:16,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:29:16,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 14:29:16,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 14:29:17,285 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.865e+02 2.105e+02 2.513e+02 4.105e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 14:29:20,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 14:29:22,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:29:25,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:29:25,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:29:27,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:27,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:29:27,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:29:27,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 14:29:30,713 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:32,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:29:33,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:29:37,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 14:29:38,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:29:38,624 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 14:29:40,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 14:29:40,268 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=741213.3333333334, ans=0.1 2023-09-30 14:29:48,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:29:50,486 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:29:52,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 14:29:52,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:29:52,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:29:53,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:58,062 INFO [train.py:1039] (1/4) Epoch 21, batch 4950, loss[loss=0.1724, simple_loss=0.2606, pruned_loss=0.04209, over 24519.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2497, pruned_loss=0.04881, over 4704573.79 frames. ], batch size: 71, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:29:58,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:29:58,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:30:00,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:30:00,166 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 14:30:00,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=741280.0, ans=0.0 2023-09-30 14:30:01,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:30:05,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:05,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:30:08,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 14:30:08,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 14:30:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:30:10,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 14:30:10,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:10,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:30:11,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:30:12,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:13,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:15,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:30:16,622 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:30:18,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:20,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:20,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:30:24,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:30:28,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:30,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:30:31,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:33,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:35,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:30:35,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 14:30:35,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 14:30:39,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:42,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:30:42,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:30:43,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:30:43,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:30:45,265 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:30:48,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:49,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:30:51,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:30:53,706 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:55,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 14:30:56,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:30:57,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:31:00,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:31:03,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:31:03,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:31:03,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:03,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:31:05,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:31:05,689 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=741546.6666666666, ans=22.5 2023-09-30 14:31:06,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:31:08,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:31:08,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:31:10,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 14:31:11,185 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.80 vs. limit=15.0 2023-09-30 14:31:16,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:17,958 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=741546.6666666666, ans=0.0 2023-09-30 14:31:19,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 14:31:19,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:31:22,296 INFO [train.py:1039] (1/4) Epoch 21, batch 5000, loss[loss=0.1864, simple_loss=0.2368, pruned_loss=0.06801, over 19196.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.249, pruned_loss=0.04909, over 4699751.93 frames. ], batch size: 388, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:31:28,298 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:28,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:29,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 14:31:31,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 14:31:32,868 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:31:36,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 14:31:36,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:31:37,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:31:37,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 14:31:37,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=741680.0, ans=0.0 2023-09-30 14:31:38,999 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:39,103 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:31:40,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 14:31:40,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:40,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:31:43,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 14:31:45,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 14:31:45,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:31:45,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 14:31:45,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:31:46,158 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=741680.0, ans=0.125 2023-09-30 14:31:47,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:49,548 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:31:49,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 14:31:49,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 14:31:52,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 14:31:52,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:54,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:54,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 14:31:54,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:55,141 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.45 vs. limit=15.0 2023-09-30 14:31:55,894 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:57,362 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:58,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:32:00,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 14:32:01,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:32:01,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:32:03,261 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.857e+02 2.049e+02 2.517e+02 4.196e+02, threshold=4.099e+02, percent-clipped=0.0 2023-09-30 14:32:07,012 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 14:32:10,182 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:32:11,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:32:11,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:14,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 14:32:16,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:32:16,077 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:16,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:32:19,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 14:32:19,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:21,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:23,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:32:29,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 14:32:32,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:36,107 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=741880.0, ans=0.125 2023-09-30 14:32:43,720 INFO [train.py:1039] (1/4) Epoch 21, batch 5050, loss[loss=0.1715, simple_loss=0.2498, pruned_loss=0.04655, over 23750.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2498, pruned_loss=0.04902, over 4713592.78 frames. ], batch size: 149, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:32:43,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:45,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:45,498 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:32:45,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:45,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:32:47,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:32:47,140 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:50,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=741946.6666666666, ans=0.2 2023-09-30 14:32:51,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 14:32:53,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:32:57,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:59,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:32:59,134 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 14:33:00,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:02,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:33:03,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:33:03,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:33:05,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:33:15,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 14:33:16,396 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-30 14:33:16,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:33:17,110 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:18,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 14:33:20,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:20,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:20,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:20,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:33:20,353 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 14:33:21,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 14:33:23,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:25,118 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=742080.0, ans=0.125 2023-09-30 14:33:26,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:29,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:29,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 14:33:33,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:36,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 14:33:37,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:33:37,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:33:37,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:33:37,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:39,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:33:42,414 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:33:42,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:42,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:33:42,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:33:43,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 14:33:45,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:33:47,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:52,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:52,649 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 14:33:52,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:33:54,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:33:54,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:54,267 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 14:33:54,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742213.3333333334, ans=0.1 2023-09-30 14:33:56,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:56,065 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 14:33:56,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:00,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:01,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:01,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 14:34:04,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 14:34:05,793 INFO [train.py:1039] (1/4) Epoch 21, batch 5100, loss[loss=0.1722, simple_loss=0.2595, pruned_loss=0.04244, over 24383.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2509, pruned_loss=0.04944, over 4705177.25 frames. ], batch size: 77, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:34:06,195 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742280.0, ans=0.1 2023-09-30 14:34:07,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:07,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:07,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:34:10,640 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 14:34:13,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:34:15,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 14:34:15,512 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 14:34:15,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:19,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:34:22,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:34:22,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 14:34:23,384 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 14:34:27,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:28,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:34:31,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:35,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 14:34:36,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:38,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:38,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:34:40,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:41,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742413.3333333334, ans=0.1 2023-09-30 14:34:41,622 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=12.0 2023-09-30 14:34:43,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:43,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 14:34:45,206 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 14:34:45,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:46,538 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.007e+02 2.200e+02 2.519e+02 3.504e+02, threshold=4.400e+02, percent-clipped=0.0 2023-09-30 14:34:46,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 14:34:46,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 14:34:49,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:58,819 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=15.0 2023-09-30 14:35:01,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:04,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 14:35:04,699 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 14:35:04,711 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 14:35:06,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 14:35:06,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:35:08,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 14:35:13,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 14:35:15,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:35:16,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:35:19,965 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 14:35:21,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:35:22,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 14:35:24,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=742546.6666666666, ans=0.0 2023-09-30 14:35:28,127 INFO [train.py:1039] (1/4) Epoch 21, batch 5150, loss[loss=0.152, simple_loss=0.2249, pruned_loss=0.03949, over 24476.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2512, pruned_loss=0.04979, over 4712427.95 frames. ], batch size: 58, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:35:28,311 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:35:28,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:35:28,338 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:35:29,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:35:31,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:35:31,776 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.19 vs. limit=10.0 2023-09-30 14:35:32,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:35:33,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 14:35:33,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 14:35:34,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 14:35:34,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:35:35,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 14:35:35,641 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:35,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:35:37,993 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:39,630 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:43,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=742680.0, ans=0.125 2023-09-30 14:35:44,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:35:44,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 14:35:47,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:48,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:35:51,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:35:51,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:35:51,612 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:35:51,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:35:51,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:35:51,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 14:35:54,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:35:54,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:35:57,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:35:59,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 14:35:59,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:36:06,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:36:06,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 14:36:10,177 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-09-30 14:36:11,133 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:36:20,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:21,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:24,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:24,836 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:26,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 14:36:32,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:36:33,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:36:34,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:36:35,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:37,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:39,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 14:36:45,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:45,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:36:46,072 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=742880.0, ans=0.125 2023-09-30 14:36:47,709 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:47,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:36:49,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:36:49,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:36:49,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:36:50,686 INFO [train.py:1039] (1/4) Epoch 21, batch 5200, loss[loss=0.1796, simple_loss=0.2534, pruned_loss=0.0529, over 23217.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2514, pruned_loss=0.04928, over 4702783.69 frames. ], batch size: 105, lr: 4.84e-03, grad_scale: 32.0 2023-09-30 14:36:50,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:36:54,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:36:56,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:36:56,866 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=742946.6666666666, ans=0.125 2023-09-30 14:36:59,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:02,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 14:37:04,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:37:05,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:07,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:08,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:37:08,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:10,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 14:37:15,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:37:15,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:18,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 14:37:21,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:37:21,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:37:23,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 14:37:23,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 14:37:25,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 14:37:27,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:27,491 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 14:37:27,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:29,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:29,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:37:30,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 14:37:31,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:37:32,279 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.872e+02 2.039e+02 2.379e+02 3.474e+02, threshold=4.079e+02, percent-clipped=0.0 2023-09-30 14:37:33,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:35,655 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 14:37:35,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 14:37:37,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 14:37:40,443 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=743146.6666666666, ans=0.0 2023-09-30 14:37:43,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 14:37:43,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:37:47,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=743146.6666666666, ans=0.2 2023-09-30 14:37:49,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:37:49,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:37:51,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 14:37:51,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:52,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 14:37:52,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:52,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:37:56,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:37:57,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:38:03,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:38:03,652 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=743213.3333333334, ans=0.125 2023-09-30 14:38:04,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:04,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:08,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=743213.3333333334, ans=0.1 2023-09-30 14:38:09,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:09,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 14:38:11,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:38:11,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:38:12,504 INFO [train.py:1039] (1/4) Epoch 21, batch 5250, loss[loss=0.1729, simple_loss=0.2387, pruned_loss=0.05357, over 23796.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2507, pruned_loss=0.04929, over 4698294.86 frames. ], batch size: 212, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:38:12,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:14,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:38:14,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:38:14,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=743280.0, ans=0.125 2023-09-30 14:38:17,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:38:20,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:20,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:38:22,258 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:38:23,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=743280.0, ans=15.0 2023-09-30 14:38:26,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:28,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:38:30,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:38:30,565 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.55 vs. limit=10.0 2023-09-30 14:38:33,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:38:35,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 14:38:35,195 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:37,246 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:08,250 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=743480.0, ans=0.04949747468305833 2023-09-30 14:39:25,625 INFO [train.py:1039] (1/4) Epoch 21, batch 5300, loss[loss=0.191, simple_loss=0.2748, pruned_loss=0.0536, over 24431.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2489, pruned_loss=0.04955, over 4692094.24 frames. ], batch size: 69, lr: 4.84e-03, grad_scale: 8.0 2023-09-30 14:39:31,703 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=743613.3333333334, ans=0.125 2023-09-30 14:39:40,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:39:40,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 14:39:41,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 14:39:41,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:41,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:41,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:41,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:41,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:41,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:39:41,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:41,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:39:42,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:39:42,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 14:39:42,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 14:39:42,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 14:39:43,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:39:43,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 14:39:43,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 14:39:43,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:44,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:44,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:44,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:44,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:39:44,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:44,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:44,966 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:45,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:45,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:45,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:39:45,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:45,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:39:46,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 14:39:46,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:47,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:47,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 14:39:47,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 14:39:47,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:39:47,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:39:47,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 14:39:47,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 14:39:47,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:48,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:39:48,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:48,902 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 14:39:49,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 14:39:49,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:39:49,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:49,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 14:39:49,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 14:39:49,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 14:39:50,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:58,323 INFO [train.py:1039] (1/4) Epoch 22, batch 0, loss[loss=0.2034, simple_loss=0.2798, pruned_loss=0.06349, over 23277.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2798, pruned_loss=0.06349, over 23277.00 frames. ], batch size: 93, lr: 4.73e-03, grad_scale: 16.0 2023-09-30 14:39:58,324 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 14:40:11,504 INFO [train.py:1071] (1/4) Epoch 22, validation: loss=0.3042, simple_loss=0.2741, pruned_loss=0.1671, over 1125622.00 frames. 2023-09-30 14:40:11,505 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 14:40:13,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 14:40:15,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:40:16,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:40:21,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:21,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:40:22,863 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:22,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 14:40:25,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 14:40:28,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:28,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:28,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=743760.0, ans=0.125 2023-09-30 14:40:31,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:32,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:40:32,919 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:35,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.911e+02 2.208e+02 2.678e+02 6.793e+02, threshold=4.416e+02, percent-clipped=10.0 2023-09-30 14:40:35,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 14:40:37,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:46,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:40:46,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:48,559 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 14:40:50,236 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=743826.6666666666, ans=0.125 2023-09-30 14:40:54,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:40:54,703 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:40:57,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:01,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:41:06,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:09,382 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=743893.3333333334, ans=0.125 2023-09-30 14:41:09,485 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=743893.3333333334, ans=0.1 2023-09-30 14:41:10,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 14:41:13,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 14:41:15,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:15,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:15,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:41:16,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:41:17,449 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.50 vs. limit=15.0 2023-09-30 14:41:19,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 14:41:21,301 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:24,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:29,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:41:31,645 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 14:41:33,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:41:35,327 INFO [train.py:1039] (1/4) Epoch 22, batch 50, loss[loss=0.1629, simple_loss=0.2461, pruned_loss=0.03987, over 24447.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2554, pruned_loss=0.04968, over 1065391.94 frames. ], batch size: 66, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:41:37,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.78 vs. limit=15.0 2023-09-30 14:41:38,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:40,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:40,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 14:41:40,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:41:40,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:41:43,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:43,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:46,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:48,239 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=744026.6666666666, ans=0.0 2023-09-30 14:41:50,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 14:41:51,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:55,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=744093.3333333334, ans=0.0 2023-09-30 14:41:57,483 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-09-30 14:41:59,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:42:00,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 14:42:02,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 14:42:03,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:42:06,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:06,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:06,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:08,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:42:09,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:42:09,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:16,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:19,739 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:19,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:42:19,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 14:42:21,569 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=744160.0, ans=0.0 2023-09-30 14:42:22,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:42:24,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:42:24,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 14:42:24,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:25,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 14:42:35,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:42:35,547 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:37,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:37,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=744226.6666666666, ans=0.125 2023-09-30 14:42:39,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:39,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:41,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 14:42:41,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 14:42:43,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:43,288 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:44,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:46,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:47,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 14:42:47,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 14:42:49,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:42:52,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:52,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:42:52,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 14:42:53,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 14:42:54,492 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:54,625 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:56,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:42:57,536 INFO [train.py:1039] (1/4) Epoch 22, batch 100, loss[loss=0.1792, simple_loss=0.2482, pruned_loss=0.05512, over 23796.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2525, pruned_loss=0.04894, over 1888679.60 frames. ], batch size: 164, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:42:57,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:43:00,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:43:03,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:43:07,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:07,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 14:43:07,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:43:08,626 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.45 vs. limit=22.5 2023-09-30 14:43:12,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:43:12,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:12,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:43:12,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:43:12,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:14,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 14:43:16,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:43:17,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:17,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:17,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:21,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744426.6666666666, ans=0.1 2023-09-30 14:43:21,603 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:43:22,474 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.832e+02 2.020e+02 2.279e+02 4.259e+02, threshold=4.040e+02, percent-clipped=0.0 2023-09-30 14:43:22,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 14:43:22,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:23,005 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744426.6666666666, ans=0.1 2023-09-30 14:43:24,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:25,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:43:26,088 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=744426.6666666666, ans=0.125 2023-09-30 14:43:27,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:43:31,634 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 14:43:31,671 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 14:43:33,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:43:33,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:43:36,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:43:36,727 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=744493.3333333334, ans=0.2 2023-09-30 14:43:38,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:38,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:39,754 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.07 vs. limit=15.0 2023-09-30 14:43:45,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:47,240 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 14:43:48,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:43:49,055 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=744560.0, ans=0.1 2023-09-30 14:43:53,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:43:53,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:43:55,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=744560.0, ans=0.125 2023-09-30 14:43:56,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:00,018 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:03,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:04,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:44:07,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:07,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:09,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=744626.6666666666, ans=0.07 2023-09-30 14:44:10,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:10,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:44:10,731 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:10,960 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=744626.6666666666, ans=0.125 2023-09-30 14:44:12,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 14:44:12,225 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 14:44:12,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:14,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:44:14,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:14,433 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:14,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:44:14,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:44:14,568 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:44:16,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:18,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:18,558 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:20,013 INFO [train.py:1039] (1/4) Epoch 22, batch 150, loss[loss=0.1755, simple_loss=0.2658, pruned_loss=0.04259, over 24290.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2521, pruned_loss=0.04874, over 2513964.24 frames. ], batch size: 74, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:44:20,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:44:20,386 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=744693.3333333334, ans=0.2 2023-09-30 14:44:21,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:44:21,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=744693.3333333334, ans=0.2 2023-09-30 14:44:24,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:27,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:27,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:44:27,661 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:30,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:30,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:34,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:44:35,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:40,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 14:44:40,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 14:44:40,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 14:44:43,676 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:44:43,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:44:45,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:44:47,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:47,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:47,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:47,461 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:48,944 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 14:44:51,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:57,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:00,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:45:01,754 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 14:45:04,040 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.89 vs. limit=22.5 2023-09-30 14:45:05,228 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-09-30 14:45:06,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:45:06,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:08,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:09,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:45:11,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:45:12,736 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:45:12,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:12,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 14:45:17,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:19,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:19,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:45:19,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:45:22,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:24,529 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 14:45:26,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:45:28,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:45:29,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:31,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:45:32,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 14:45:32,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:32,886 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 14:45:37,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:40,981 INFO [train.py:1039] (1/4) Epoch 22, batch 200, loss[loss=0.155, simple_loss=0.2337, pruned_loss=0.03813, over 24441.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2541, pruned_loss=0.04987, over 3002064.26 frames. ], batch size: 58, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:45:41,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:45:41,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:45:45,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 14:45:45,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:47,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:48,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 14:45:50,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:45:50,674 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=745026.6666666666, ans=0.0 2023-09-30 14:45:51,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:53,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:58,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:45:58,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:58,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:02,277 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:46:05,543 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.853e+02 2.014e+02 2.371e+02 3.492e+02, threshold=4.028e+02, percent-clipped=0.0 2023-09-30 14:46:06,285 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.70 vs. limit=15.0 2023-09-30 14:46:13,405 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=745160.0, ans=0.0 2023-09-30 14:46:18,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:46:19,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:46:20,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:46:21,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:46:23,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 14:46:23,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:46:23,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:24,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:46:25,419 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.42 vs. limit=15.0 2023-09-30 14:46:26,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:26,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:46:28,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 14:46:28,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:46:28,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:31,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=745226.6666666666, ans=0.125 2023-09-30 14:46:35,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:46:36,896 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=745226.6666666666, ans=0.0 2023-09-30 14:46:38,326 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=745226.6666666666, ans=0.0 2023-09-30 14:46:41,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:47,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:47,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:46:56,556 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=745293.3333333334, ans=0.2 2023-09-30 14:46:57,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=12.0 2023-09-30 14:46:57,932 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:00,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 14:47:01,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:01,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:47:02,299 INFO [train.py:1039] (1/4) Epoch 22, batch 250, loss[loss=0.1723, simple_loss=0.2591, pruned_loss=0.0427, over 24675.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2536, pruned_loss=0.04989, over 3388663.49 frames. ], batch size: 73, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:47:02,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:02,485 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:47:04,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 14:47:04,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:47:04,161 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 14:47:06,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:09,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:47:09,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:11,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:13,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:47:14,355 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=745360.0, ans=15.0 2023-09-30 14:47:15,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:16,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:47:19,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:47:32,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:35,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:35,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:47:35,669 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=745493.3333333334, ans=0.125 2023-09-30 14:47:42,629 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=745493.3333333334, ans=0.0 2023-09-30 14:47:43,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:47:43,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:47:45,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:47:45,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:47,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:47:47,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:47:47,377 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:51,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:47:55,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 14:47:55,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:56,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:47:56,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:47:56,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:47:57,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:47:57,238 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:47:58,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:48:00,127 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:02,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:48:03,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:05,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:48:09,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:12,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:48:12,962 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=745626.6666666666, ans=0.0 2023-09-30 14:48:18,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:19,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:48:25,009 INFO [train.py:1039] (1/4) Epoch 22, batch 300, loss[loss=0.1697, simple_loss=0.2541, pruned_loss=0.04265, over 24670.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.251, pruned_loss=0.04929, over 3668220.51 frames. ], batch size: 65, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:48:25,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 14:48:25,243 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:48:25,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:48:28,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 14:48:28,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:48:29,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:48:29,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 14:48:35,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:36,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:48:39,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:48:41,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 14:48:42,658 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:44,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:48:44,202 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 14:48:44,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:48:49,076 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.843e+02 2.061e+02 2.404e+02 3.309e+02, threshold=4.123e+02, percent-clipped=0.0 2023-09-30 14:48:50,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:48:53,683 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:48:53,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 14:48:56,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 14:48:57,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:48:59,482 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=745826.6666666666, ans=0.125 2023-09-30 14:49:00,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:02,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:02,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 14:49:02,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:49:05,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:49:07,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:49:07,530 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:10,643 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:49:10,650 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 14:49:12,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:49:15,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:16,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 14:49:16,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:16,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=745893.3333333334, ans=0.125 2023-09-30 14:49:22,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:49:25,223 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:49:25,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 14:49:29,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:29,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:49:31,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:34,119 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:49:35,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 14:49:35,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:49:35,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:37,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 14:49:38,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:38,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:41,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:42,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:42,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:47,001 INFO [train.py:1039] (1/4) Epoch 22, batch 350, loss[loss=0.1842, simple_loss=0.2611, pruned_loss=0.05368, over 23323.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2487, pruned_loss=0.04872, over 3901340.79 frames. ], batch size: 93, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:49:48,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:49:48,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:49:53,777 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:55,763 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=746026.6666666666, ans=0.2 2023-09-30 14:49:57,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:58,855 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=746026.6666666666, ans=0.125 2023-09-30 14:50:01,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:01,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:05,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 14:50:07,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:07,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 14:50:10,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:10,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 14:50:12,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:14,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=746093.3333333334, ans=0.09899494936611666 2023-09-30 14:50:15,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 14:50:17,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=746093.3333333334, ans=0.0 2023-09-30 14:50:19,001 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:50:20,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:22,088 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:50:22,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=746160.0, ans=0.125 2023-09-30 14:50:23,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:23,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:23,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:50:27,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:50:27,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:32,256 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:50:35,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:50:35,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:50:35,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:50:35,205 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:36,979 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=746226.6666666666, ans=0.0 2023-09-30 14:50:40,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 14:50:40,604 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:45,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:45,807 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:50:45,844 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:47,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 14:50:49,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:50:49,690 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 14:50:51,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 14:50:52,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:54,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=746293.3333333334, ans=0.2 2023-09-30 14:50:55,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:55,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 14:50:58,768 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:50:59,079 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=746293.3333333334, ans=0.1 2023-09-30 14:51:00,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:51:04,040 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:05,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:05,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:07,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:10,229 INFO [train.py:1039] (1/4) Epoch 22, batch 400, loss[loss=0.1705, simple_loss=0.2522, pruned_loss=0.04436, over 23649.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2487, pruned_loss=0.04873, over 4084043.11 frames. ], batch size: 85, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:51:10,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:51:11,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:51:12,220 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=746360.0, ans=0.1 2023-09-30 14:51:13,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 14:51:13,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:15,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:15,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:51:17,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:20,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:22,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:24,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 14:51:27,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 14:51:27,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:28,227 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.65 vs. limit=15.0 2023-09-30 14:51:28,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 14:51:30,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:33,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:51:33,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:33,646 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 14:51:34,930 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.861e+02 2.105e+02 2.651e+02 3.953e+02, threshold=4.209e+02, percent-clipped=0.0 2023-09-30 14:51:35,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:51:35,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:36,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:36,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:38,955 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 14:51:40,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 14:51:43,851 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=746493.3333333334, ans=0.2 2023-09-30 14:51:45,101 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:46,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:46,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 14:51:49,133 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-09-30 14:51:50,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 14:51:53,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:51:55,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:51:57,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=746493.3333333334, ans=0.2 2023-09-30 14:52:02,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 14:52:05,361 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:52:05,631 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=746560.0, ans=0.09899494936611666 2023-09-30 14:52:08,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 14:52:09,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:52:10,218 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=746560.0, ans=0.125 2023-09-30 14:52:12,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:52:12,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 14:52:14,739 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.07 vs. limit=22.5 2023-09-30 14:52:15,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:52:18,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:52:19,363 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.97 vs. limit=22.5 2023-09-30 14:52:19,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:52:23,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:23,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 14:52:25,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:52:31,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 14:52:33,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:52:33,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:52:35,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 14:52:35,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:52:36,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:52:39,081 INFO [train.py:1039] (1/4) Epoch 22, batch 450, loss[loss=0.155, simple_loss=0.239, pruned_loss=0.03546, over 24327.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2492, pruned_loss=0.04887, over 4218287.68 frames. ], batch size: 61, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:52:39,149 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:52:40,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 14:52:41,991 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:52:42,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:52:43,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:52:43,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 14:52:43,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:52:45,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:52:48,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:52:48,545 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=746693.3333333334, ans=0.125 2023-09-30 14:52:52,406 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=746693.3333333334, ans=0.125 2023-09-30 14:52:56,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:58,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:52:59,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 14:52:59,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 14:53:03,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:53:06,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:07,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:14,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:16,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:18,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 14:53:19,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 14:53:21,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 14:53:21,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:53:22,089 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=15.0 2023-09-30 14:53:23,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:24,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:53:25,132 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 14:53:25,145 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 14:53:26,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:28,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:53:28,392 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:53:32,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:53:32,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:53:34,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 14:53:34,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 14:53:36,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:38,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:53:38,448 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:53:40,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 14:53:45,160 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:53:46,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 14:53:46,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 14:53:48,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:53,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:53:56,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:53:58,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:53:58,598 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 14:54:01,604 INFO [train.py:1039] (1/4) Epoch 22, batch 500, loss[loss=0.1582, simple_loss=0.2345, pruned_loss=0.04089, over 23652.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2493, pruned_loss=0.04863, over 4330250.30 frames. ], batch size: 149, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:54:01,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:03,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:54:03,316 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:03,332 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 14:54:04,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 14:54:04,984 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:09,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:54:13,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:54:16,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:54:17,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:54:17,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:19,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:19,522 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=747093.3333333334, ans=0.125 2023-09-30 14:54:26,493 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.831e+02 2.107e+02 2.492e+02 3.806e+02, threshold=4.214e+02, percent-clipped=0.0 2023-09-30 14:54:27,033 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=747093.3333333334, ans=0.0 2023-09-30 14:54:28,548 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=747093.3333333334, ans=0.125 2023-09-30 14:54:30,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:31,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:54:31,859 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:54:31,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:33,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 14:54:33,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:54:38,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:54:39,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:54:39,564 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:54:39,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:41,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 14:54:45,674 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 14:54:47,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:54:49,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:52,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:54:55,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 14:54:59,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:55:01,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:04,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:09,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:55:15,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:17,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 14:55:17,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:17,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:20,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 14:55:22,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:55:24,278 INFO [train.py:1039] (1/4) Epoch 22, batch 550, loss[loss=0.2353, simple_loss=0.2988, pruned_loss=0.0859, over 19569.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2514, pruned_loss=0.0495, over 4415011.01 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 14:55:24,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:27,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 14:55:30,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 14:55:30,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:30,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 14:55:30,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:55:32,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:32,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:32,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:32,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:55:35,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:55:38,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:39,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 14:55:39,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:55:46,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:55:46,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:47,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:55:49,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:53,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 14:55:54,087 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 14:55:55,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:55:59,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=747493.3333333334, ans=0.125 2023-09-30 14:56:02,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:56:02,304 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:03,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:56:06,772 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=12.0 2023-09-30 14:56:09,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:09,103 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 14:56:10,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:56:10,887 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=747493.3333333334, ans=0.125 2023-09-30 14:56:12,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 14:56:14,360 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:15,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:56:15,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:56:17,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:17,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 14:56:19,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 14:56:21,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:21,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:56:21,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:56:21,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:56:24,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:56:25,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:56:28,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:56:29,072 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:30,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:56:32,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:56:34,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:35,675 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:56:35,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:38,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:56:38,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:56:44,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 14:56:47,701 INFO [train.py:1039] (1/4) Epoch 22, batch 600, loss[loss=0.2346, simple_loss=0.2994, pruned_loss=0.08492, over 19651.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2527, pruned_loss=0.05049, over 4473856.58 frames. ], batch size: 389, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:56:49,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 14:56:50,745 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:56:50,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:56:50,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:57,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:00,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:57:01,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 14:57:03,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:57:07,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:08,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:11,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 14:57:11,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:57:13,347 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.858e+02 2.031e+02 2.339e+02 3.248e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 14:57:18,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 14:57:22,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:57:22,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:23,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:57:28,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:57:28,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:57:30,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:37,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:57:40,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:40,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:40,962 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:42,897 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:57:49,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 14:57:54,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:57:56,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:59,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 14:58:01,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:58:03,257 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:58:04,735 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 14:58:04,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:58:04,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:58:10,811 INFO [train.py:1039] (1/4) Epoch 22, batch 650, loss[loss=0.1683, simple_loss=0.2396, pruned_loss=0.04851, over 24355.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2518, pruned_loss=0.05008, over 4530656.53 frames. ], batch size: 56, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:58:10,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:58:12,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:58:16,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:58:17,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:58:19,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:21,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 14:58:23,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:58:26,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-09-30 14:58:29,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:58:29,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:30,245 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-09-30 14:58:33,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=748093.3333333334, ans=0.125 2023-09-30 14:58:34,726 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:38,602 WARNING [train.py:1197] (1/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 14:58:38,908 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-09-30 14:58:40,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:58:40,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-09-30 14:58:41,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:44,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:58:44,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 14:58:47,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:47,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:48,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:58:51,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:51,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:58:54,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:58:54,765 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 14:58:54,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:54,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:58:59,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:59,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:01,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:01,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:59:01,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 14:59:04,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:59:04,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:59:06,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:59:06,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:08,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:59:09,751 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 14:59:11,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 14:59:11,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:11,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:59:12,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:59:12,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:59:14,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:59:20,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:20,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:59:22,120 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:59:23,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:23,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 14:59:25,730 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:32,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:59:32,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:32,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:32,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:33,874 INFO [train.py:1039] (1/4) Epoch 22, batch 700, loss[loss=0.1749, simple_loss=0.24, pruned_loss=0.0549, over 23880.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2502, pruned_loss=0.04915, over 4572333.49 frames. ], batch size: 195, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:59:37,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 14:59:37,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 14:59:41,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 14:59:41,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:43,092 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:59:45,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 14:59:50,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:53,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:59:55,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:58,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:59:58,836 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:59:59,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=748426.6666666666, ans=0.125 2023-09-30 15:00:00,127 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.824e+02 1.972e+02 2.211e+02 2.960e+02, threshold=3.944e+02, percent-clipped=0.0 2023-09-30 15:00:01,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:00:07,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:00:07,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:00:07,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 15:00:07,403 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:00:11,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 15:00:15,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:00:15,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:00:18,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:00:22,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:00:23,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 15:00:26,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:26,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:00:28,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 15:00:30,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:00:32,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:35,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:00:40,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:00:41,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 15:00:45,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 15:00:45,773 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 15:00:49,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:52,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:00:52,500 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=748626.6666666666, ans=0.125 2023-09-30 15:00:53,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:00:53,987 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:53,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 15:00:56,884 INFO [train.py:1039] (1/4) Epoch 22, batch 750, loss[loss=0.1622, simple_loss=0.2401, pruned_loss=0.0421, over 24602.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2489, pruned_loss=0.04875, over 4603812.18 frames. ], batch size: 60, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 15:00:58,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 15:00:58,562 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 15:00:58,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 15:01:00,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 15:01:00,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 15:01:00,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:01:01,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 15:01:03,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:01:03,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:05,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:06,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:07,228 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=748693.3333333334, ans=0.125 2023-09-30 15:01:08,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:01:08,485 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:10,764 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=748693.3333333334, ans=0.0 2023-09-30 15:01:13,449 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:01:14,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:01:16,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:01:18,560 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=748760.0, ans=0.125 2023-09-30 15:01:20,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:20,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:20,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 15:01:24,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:01:24,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:25,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:27,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:01:28,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 15:01:28,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:01:30,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 15:01:30,474 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 15:01:31,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 15:01:31,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:01:33,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:01:34,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:01:41,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:41,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:01:41,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:01:42,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:45,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:45,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 15:01:46,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:01:46,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:01:47,469 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.47 vs. limit=12.0 2023-09-30 15:01:48,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:01:49,962 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:01:52,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:01:54,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 15:01:54,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:00,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:01,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:02:03,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:06,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:02:07,976 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=748960.0, ans=0.0 2023-09-30 15:02:10,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 15:02:10,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:10,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:13,116 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:14,524 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:16,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:17,938 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:02:19,337 INFO [train.py:1039] (1/4) Epoch 22, batch 800, loss[loss=0.1752, simple_loss=0.248, pruned_loss=0.05123, over 23485.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2499, pruned_loss=0.04883, over 4625609.12 frames. ], batch size: 134, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:02:25,212 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=749026.6666666666, ans=0.125 2023-09-30 15:02:26,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:26,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:28,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:28,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:30,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:30,493 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:33,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:37,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:38,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:02:40,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 15:02:41,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:42,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:42,089 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:02:43,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:43,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 15:02:44,122 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.23 vs. limit=15.0 2023-09-30 15:02:45,048 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:45,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 15:02:47,014 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 2.019e+02 2.284e+02 4.447e+02, threshold=4.039e+02, percent-clipped=1.0 2023-09-30 15:02:48,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:48,961 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=749093.3333333334, ans=0.0 2023-09-30 15:02:51,743 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:53,427 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=749160.0, ans=0.035 2023-09-30 15:02:55,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:56,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:59,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:59,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:01,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:03,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:03:03,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 15:03:06,947 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 15:03:06,994 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 15:03:07,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:03:08,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:09,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:09,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:15,310 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 15:03:15,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 15:03:16,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:03:17,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=749226.6666666666, ans=0.0 2023-09-30 15:03:17,749 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.58 vs. limit=15.0 2023-09-30 15:03:18,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:03:18,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=749226.6666666666, ans=0.125 2023-09-30 15:03:22,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:03:25,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:27,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 15:03:27,853 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=749293.3333333334, ans=0.2 2023-09-30 15:03:28,130 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.whiten.whitening_limit, batch_count=749293.3333333334, ans=12.0 2023-09-30 15:03:28,175 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.37 vs. limit=15.0 2023-09-30 15:03:28,908 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:03:30,639 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 15:03:31,254 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-09-30 15:03:38,778 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:42,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:03:42,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 15:03:42,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:03:44,064 INFO [train.py:1039] (1/4) Epoch 22, batch 850, loss[loss=0.1479, simple_loss=0.2197, pruned_loss=0.03807, over 24443.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2511, pruned_loss=0.04973, over 4640250.69 frames. ], batch size: 58, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:03:44,216 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:45,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 15:03:45,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:47,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:03:49,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:50,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:03:52,274 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:52,460 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 15:03:53,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 15:03:53,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 15:03:55,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:55,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:58,108 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=749360.0, ans=0.125 2023-09-30 15:03:59,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:59,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:59,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:04:04,537 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:04,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:04,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 15:04:07,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 15:04:12,895 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:13,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 15:04:14,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 15:04:16,478 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 15:04:19,511 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 15:04:19,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:19,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:04:19,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:04:24,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:24,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:25,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 15:04:26,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:28,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:28,276 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:04:29,772 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:04:31,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:04:32,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:04:32,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 15:04:36,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:04:36,705 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:04:37,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=749560.0, ans=0.125 2023-09-30 15:04:38,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:04:38,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:04:41,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:44,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:46,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:04:48,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=749560.0, ans=0.125 2023-09-30 15:04:49,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:04:49,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:04:50,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:05:00,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:05:02,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:05:02,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 15:05:02,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:02,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:05:05,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 15:05:07,238 INFO [train.py:1039] (1/4) Epoch 22, batch 900, loss[loss=0.1615, simple_loss=0.2454, pruned_loss=0.03882, over 24512.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2518, pruned_loss=0.04955, over 4660309.90 frames. ], batch size: 63, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:05:13,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:05:18,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:18,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 15:05:21,982 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:05:22,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 15:05:23,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:05:25,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:25,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:25,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:05:26,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:05:32,811 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.809e+02 2.095e+02 2.574e+02 4.591e+02, threshold=4.190e+02, percent-clipped=1.0 2023-09-30 15:05:36,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:05:36,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:36,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:05:38,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:43,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 15:05:45,433 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:05:48,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:05:49,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:05:50,090 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 15:05:51,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 15:05:59,652 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:05:59,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:05:59,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:06:07,382 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:07,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:06:09,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 15:06:10,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:06:12,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 15:06:15,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:06:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:17,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:06:17,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:24,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 15:06:25,502 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 15:06:25,708 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:06:25,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 15:06:28,563 INFO [train.py:1039] (1/4) Epoch 22, batch 950, loss[loss=0.1746, simple_loss=0.2566, pruned_loss=0.04635, over 23695.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2522, pruned_loss=0.04967, over 4682098.73 frames. ], batch size: 85, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:06:28,789 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:33,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 15:06:38,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:38,904 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=750026.6666666666, ans=0.1 2023-09-30 15:06:41,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:41,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:43,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:06:46,307 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 15:06:48,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:50,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:06:52,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:52,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:06:52,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 15:06:52,438 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:06:55,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:56,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 15:06:57,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:06:59,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=750093.3333333334, ans=0.1 2023-09-30 15:07:00,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:00,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:07:00,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:07:00,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 15:07:04,553 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:07:06,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:07:07,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:07:12,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:07:12,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:07:16,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 15:07:20,137 WARNING [train.py:1197] (1/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:07:20,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:07:20,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:22,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:22,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:07:27,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 15:07:27,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:07:30,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:32,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:32,417 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 15:07:32,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:32,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:07:32,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 15:07:34,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=750293.3333333334, ans=0.025 2023-09-30 15:07:37,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:07:40,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:45,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:07:47,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 15:07:47,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 15:07:51,488 INFO [train.py:1039] (1/4) Epoch 22, batch 1000, loss[loss=0.1509, simple_loss=0.2292, pruned_loss=0.03631, over 24596.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2513, pruned_loss=0.04915, over 4698112.52 frames. ], batch size: 60, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:07:54,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:57,074 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=750360.0, ans=0.125 2023-09-30 15:07:58,365 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 15:07:58,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:03,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:08:05,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 15:08:05,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 15:08:10,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:10,613 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:08:14,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:15,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 15:08:18,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 15:08:19,276 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=750426.6666666666, ans=0.125 2023-09-30 15:08:20,244 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.832e+02 2.013e+02 2.220e+02 3.757e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 15:08:21,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 15:08:21,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:24,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 15:08:25,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 15:08:26,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 15:08:27,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:28,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:38,519 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:38,777 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=750493.3333333334, ans=0.1 2023-09-30 15:08:39,219 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.32 vs. limit=15.0 2023-09-30 15:08:39,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:08:40,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:40,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:40,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 15:08:42,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:43,791 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:08:43,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:43,952 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 15:08:47,745 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=750560.0, ans=0.2 2023-09-30 15:08:49,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 15:08:50,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 15:08:52,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 15:08:53,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:08:55,836 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=750560.0, ans=0.2 2023-09-30 15:08:58,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:58,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:08:59,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:00,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:09:01,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 15:09:02,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:09:02,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 15:09:04,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 15:09:04,347 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:04,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:09:09,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:09:11,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:09:12,917 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=750626.6666666666, ans=0.0 2023-09-30 15:09:14,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:09:15,517 INFO [train.py:1039] (1/4) Epoch 22, batch 1050, loss[loss=0.1559, simple_loss=0.2337, pruned_loss=0.03907, over 24618.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2508, pruned_loss=0.04873, over 4720647.73 frames. ], batch size: 60, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:09:17,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:09:21,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:09:22,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:09:24,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:24,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:27,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=750693.3333333334, ans=0.125 2023-09-30 15:09:29,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:09:30,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:09:32,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:09:33,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:09:33,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:09:34,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:09:35,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 15:09:35,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:37,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 15:09:41,273 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:41,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 15:09:41,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:09:49,035 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:50,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:09:50,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:52,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 15:09:54,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 15:09:54,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:57,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 15:09:58,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 15:09:59,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:02,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:10:03,835 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.15 vs. limit=15.0 2023-09-30 15:10:04,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:10:05,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:06,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:10:10,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:10:14,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 15:10:16,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 15:10:16,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 15:10:16,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:17,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:10:19,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 15:10:23,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:10:24,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:24,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:10:24,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:24,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:28,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=750960.0, ans=0.125 2023-09-30 15:10:29,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 15:10:32,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:32,824 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 15:10:32,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 15:10:34,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:10:37,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:10:38,860 INFO [train.py:1039] (1/4) Epoch 22, batch 1100, loss[loss=0.1882, simple_loss=0.2744, pruned_loss=0.05102, over 24648.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2495, pruned_loss=0.04862, over 4711929.69 frames. ], batch size: 73, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:10:45,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:10:50,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:10:50,349 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=751026.6666666666, ans=0.125 2023-09-30 15:10:53,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:10:53,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:10:53,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 15:10:55,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:56,921 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:10:59,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:11:02,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:11:02,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 15:11:05,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:11:06,753 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:06,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:11:08,138 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.829e+02 2.096e+02 2.478e+02 4.106e+02, threshold=4.191e+02, percent-clipped=1.0 2023-09-30 15:11:08,605 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=751093.3333333334, ans=0.125 2023-09-30 15:11:09,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:11:10,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:11:14,796 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=751160.0, ans=0.0 2023-09-30 15:11:14,886 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=751160.0, ans=0.0 2023-09-30 15:11:16,022 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:11:19,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 15:11:19,941 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 15:11:20,180 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=751160.0, ans=0.0 2023-09-30 15:11:21,272 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:21,589 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=751160.0, ans=0.0 2023-09-30 15:11:22,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:24,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:11:24,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:11:26,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 15:11:28,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:11:28,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:11:28,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:11:29,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:29,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 15:11:34,897 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:11:34,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 15:11:37,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:11:41,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:11:42,203 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. limit=6.0 2023-09-30 15:11:44,264 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 15:11:44,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:11:45,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:47,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:48,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:50,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 15:11:52,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:11:52,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:52,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 15:11:54,046 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:11:54,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 15:11:56,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:11:56,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:11:57,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:11:58,040 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751293.3333333334, ans=0.1 2023-09-30 15:12:00,867 INFO [train.py:1039] (1/4) Epoch 22, batch 1150, loss[loss=0.1855, simple_loss=0.2514, pruned_loss=0.05978, over 23455.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2496, pruned_loss=0.04838, over 4717205.91 frames. ], batch size: 285, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:12:01,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:05,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:12:08,041 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:08,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:12:08,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 15:12:09,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:12,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 15:12:15,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:15,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:12:21,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 15:12:23,212 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:28,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:28,651 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=751426.6666666666, ans=0.125 2023-09-30 15:12:29,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:29,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 15:12:29,759 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:12:29,798 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:35,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 15:12:36,592 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:38,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:48,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:55,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:56,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 15:12:56,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:12:56,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:01,951 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 15:13:03,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:03,871 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=751560.0, ans=10.0 2023-09-30 15:13:10,491 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 15:13:17,053 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:18,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:13:18,701 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:13:18,765 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:13:22,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:23,903 INFO [train.py:1039] (1/4) Epoch 22, batch 1200, loss[loss=0.1771, simple_loss=0.2468, pruned_loss=0.0537, over 24432.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2509, pruned_loss=0.04873, over 4720289.22 frames. ], batch size: 58, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:13:24,217 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=751693.3333333334, ans=0.1 2023-09-30 15:13:27,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:13:27,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:13:28,916 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:28,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:30,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:13:33,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:13:35,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:13:35,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:35,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:38,649 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 15:13:41,729 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 15:13:43,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:13:45,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:13:47,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:51,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:13:51,380 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 15:13:52,834 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:55,745 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.831e+02 1.967e+02 2.216e+02 3.733e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 15:14:00,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:14:00,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:14:00,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 15:14:00,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:14:05,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 15:14:09,073 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=751826.6666666666, ans=0.04949747468305833 2023-09-30 15:14:10,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 15:14:10,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:14:11,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:14:13,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:13,673 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:14:15,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:14:15,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:14:15,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:14:15,573 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=751893.3333333334, ans=0.125 2023-09-30 15:14:16,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 15:14:18,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:14:18,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:14:18,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:14:22,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:22,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:22,599 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=751893.3333333334, ans=0.125 2023-09-30 15:14:27,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:14:29,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:14:32,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 15:14:36,932 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 15:14:40,071 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:14:42,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:14:43,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:14:45,247 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:46,641 INFO [train.py:1039] (1/4) Epoch 22, batch 1250, loss[loss=0.1549, simple_loss=0.2391, pruned_loss=0.0354, over 24667.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2519, pruned_loss=0.04951, over 4712406.26 frames. ], batch size: 65, lr: 4.70e-03, grad_scale: 4.0 2023-09-30 15:14:48,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 15:14:52,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:14:52,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:14:54,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 15:14:55,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:14:56,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:14:58,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=752026.6666666666, ans=0.0 2023-09-30 15:15:00,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:15:02,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:02,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:15:02,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:05,333 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=752093.3333333334, ans=0.0 2023-09-30 15:15:06,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:15:09,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=752093.3333333334, ans=0.0 2023-09-30 15:15:10,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:15:10,974 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:15:10,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:12,564 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:14,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:17,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:18,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:15:19,859 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=752160.0, ans=0.0 2023-09-30 15:15:24,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 15:15:24,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:15:24,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=752160.0, ans=0.125 2023-09-30 15:15:26,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:26,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 15:15:28,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:28,368 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 15:15:28,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:28,403 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:32,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:36,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:37,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:15:39,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 15:15:39,066 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 15:15:40,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 15:15:43,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:15:43,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 15:15:43,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:48,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:15:48,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:15:50,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 15:15:50,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:15:51,669 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:15:51,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:15:53,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:54,799 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 15:15:57,793 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:57,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:15:59,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:16:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:16:03,464 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=752293.3333333334, ans=0.125 2023-09-30 15:16:06,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:16:08,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 15:16:09,837 INFO [train.py:1039] (1/4) Epoch 22, batch 1300, loss[loss=0.1842, simple_loss=0.2528, pruned_loss=0.0578, over 23748.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2523, pruned_loss=0.04964, over 4705835.39 frames. ], batch size: 212, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:16:11,499 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:11,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:16:13,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:14,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:16:16,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:16:17,762 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 15:16:25,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:16:25,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:16:27,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 15:16:31,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:16:35,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:37,249 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:16:37,405 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:38,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:39,731 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.90 vs. limit=15.0 2023-09-30 15:16:41,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:16:41,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:16:41,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 15:16:43,114 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.847e+02 1.992e+02 2.255e+02 3.036e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 15:16:47,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:16:47,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:16:50,905 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 15:16:52,497 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:16:54,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:16:57,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:57,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 15:16:57,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:16:59,244 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 15:17:00,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:06,885 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:06,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:17:10,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 15:17:10,124 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 15:17:12,318 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 15:17:12,597 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=752560.0, ans=0.125 2023-09-30 15:17:19,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:17:22,196 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 15:17:23,758 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:30,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 15:17:31,379 INFO [train.py:1039] (1/4) Epoch 22, batch 1350, loss[loss=0.1547, simple_loss=0.2352, pruned_loss=0.03715, over 24465.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2512, pruned_loss=0.04928, over 4698951.41 frames. ], batch size: 63, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:17:35,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:35,617 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:17:38,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:17:39,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:41,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:41,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:17:43,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:48,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:50,591 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 15:17:50,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:17:52,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:17:55,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 15:17:57,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:57,803 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=752760.0, ans=0.125 2023-09-30 15:17:58,884 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:58,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 15:18:00,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 15:18:02,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 15:18:03,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:03,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 15:18:05,776 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=752826.6666666666, ans=0.125 2023-09-30 15:18:05,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=752826.6666666666, ans=0.0 2023-09-30 15:18:14,796 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0 2023-09-30 15:18:15,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:19,094 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-09-30 15:18:25,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:26,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:26,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 15:18:29,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:31,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 15:18:31,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:18:31,503 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:18:35,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:18:37,550 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 15:18:39,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:18:39,813 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:18:45,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 15:18:47,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 15:18:53,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 15:18:54,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:55,671 INFO [train.py:1039] (1/4) Epoch 22, batch 1400, loss[loss=0.1511, simple_loss=0.2324, pruned_loss=0.03496, over 24299.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.249, pruned_loss=0.04867, over 4697835.18 frames. ], batch size: 61, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:18:57,444 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:18:57,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:19:04,621 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 15:19:06,186 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 15:19:14,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:19:16,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:19,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:19:19,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:19:25,542 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:19:25,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:19:29,132 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.828e+02 2.054e+02 2.330e+02 3.479e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 15:19:29,615 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=753160.0, ans=0.0 2023-09-30 15:19:38,111 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:38,221 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:42,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 15:19:44,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:19:44,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:19:45,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:19:47,775 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:47,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:19:47,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:19:48,028 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:19:50,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 15:19:50,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:19:55,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:58,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:20:08,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 15:20:08,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:20:10,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:20:12,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:20:13,188 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=753293.3333333334, ans=0.0 2023-09-30 15:20:16,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:17,176 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.15 vs. limit=15.0 2023-09-30 15:20:17,857 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:20:19,330 INFO [train.py:1039] (1/4) Epoch 22, batch 1450, loss[loss=0.1618, simple_loss=0.2454, pruned_loss=0.03911, over 24429.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2487, pruned_loss=0.04822, over 4706512.18 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:20:20,349 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.06 vs. limit=15.0 2023-09-30 15:20:21,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:20:23,308 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:20:23,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:23,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:20:29,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:29,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:20:31,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:20:32,385 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 15:20:32,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:20:33,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 15:20:35,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:35,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:35,574 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 15:20:37,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:20:38,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:20:38,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 15:20:38,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:40,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:20:43,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:47,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:51,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:20:51,254 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:20:54,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:54,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:55,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:57,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:20:57,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:57,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:02,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 15:21:04,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:21:07,274 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 15:21:08,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:10,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:21:11,936 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:14,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 15:21:16,715 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=753560.0, ans=0.125 2023-09-30 15:21:18,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:20,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 15:21:20,534 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:21:21,847 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 15:21:23,475 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:26,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:26,517 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:28,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 15:21:31,506 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 15:21:33,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 15:21:33,651 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:35,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:21:41,325 INFO [train.py:1039] (1/4) Epoch 22, batch 1500, loss[loss=0.2293, simple_loss=0.2839, pruned_loss=0.08738, over 19904.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2492, pruned_loss=0.04799, over 4715209.80 frames. ], batch size: 388, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:21:45,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 15:21:46,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:21:46,067 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:21:47,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:47,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:49,167 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:21:50,694 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 15:21:52,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:21:53,014 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:21:53,026 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:54,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:56,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:21:57,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:02,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:02,750 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 15:22:04,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:04,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:22:04,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:07,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 15:22:12,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 15:22:14,236 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:22:14,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 15:22:15,596 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.832e+02 2.019e+02 2.350e+02 4.853e+02, threshold=4.037e+02, percent-clipped=1.0 2023-09-30 15:22:17,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:22:20,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:22:20,400 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=753826.6666666666, ans=0.125 2023-09-30 15:22:21,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:21,721 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:22:23,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 15:22:23,366 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:22:24,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:24,828 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 15:22:24,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:30,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:22:30,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 15:22:37,034 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:22:37,266 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=753893.3333333334, ans=0.0 2023-09-30 15:22:37,888 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.44 vs. limit=15.0 2023-09-30 15:22:38,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:22:42,303 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 15:22:43,720 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:43,728 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 15:22:46,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:22:46,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:22:48,196 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 15:22:48,525 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=753960.0, ans=0.0 2023-09-30 15:22:49,663 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:52,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 15:22:52,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:57,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:57,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:58,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:58,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:59,083 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:22:59,251 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=753960.0, ans=0.125 2023-09-30 15:23:03,215 INFO [train.py:1039] (1/4) Epoch 22, batch 1550, loss[loss=0.1879, simple_loss=0.2568, pruned_loss=0.05949, over 23869.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2495, pruned_loss=0.04805, over 4733895.64 frames. ], batch size: 179, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:23:03,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 15:23:04,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 15:23:04,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:23:06,937 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 15:23:06,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 15:23:08,610 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:10,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:10,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:11,692 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:23:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:13,340 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:17,010 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 15:23:17,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:17,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:23:18,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:23:21,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:23:21,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 15:23:21,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:23,057 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 15:23:23,392 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=754093.3333333334, ans=0.07 2023-09-30 15:23:24,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 15:23:24,605 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 15:23:24,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:26,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:30,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:23:32,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 15:23:32,645 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 15:23:34,490 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=754160.0, ans=0.0 2023-09-30 15:23:41,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:41,882 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=754160.0, ans=0.125 2023-09-30 15:23:46,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:48,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:23:48,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:23:49,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 15:23:55,448 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=12.0 2023-09-30 15:23:56,012 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:23:56,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:59,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:24:01,006 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:24:02,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:02,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 15:24:03,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:05,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:24:05,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:06,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:24:06,981 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 15:24:10,176 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:16,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 15:24:22,800 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:22,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:24,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 15:24:26,438 INFO [train.py:1039] (1/4) Epoch 22, batch 1600, loss[loss=0.1731, simple_loss=0.2616, pruned_loss=0.04231, over 24592.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2509, pruned_loss=0.04859, over 4736917.44 frames. ], batch size: 71, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:24:26,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:28,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:28,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:24:28,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:24:29,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:24:34,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:34,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 15:24:35,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 15:24:38,196 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=15.0 2023-09-30 15:24:38,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 15:24:40,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:24:41,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 15:24:42,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:24:45,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:24:50,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:54,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 15:24:57,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:24:58,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 15:24:59,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:00,932 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.869e+02 2.022e+02 2.307e+02 3.871e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 15:25:01,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 15:25:07,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 15:25:13,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:15,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 15:25:16,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:25:16,468 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:25:18,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 15:25:22,334 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.18 vs. limit=10.0 2023-09-30 15:25:25,037 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 15:25:26,620 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:25:26,682 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,230 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:25:31,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:25:32,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:25:33,546 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:25:33,784 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=754626.6666666666, ans=0.125 2023-09-30 15:25:35,240 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=754626.6666666666, ans=0.125 2023-09-30 15:25:40,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:40,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:25:43,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 15:25:43,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:25:46,472 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 15:25:47,985 INFO [train.py:1039] (1/4) Epoch 22, batch 1650, loss[loss=0.1726, simple_loss=0.2363, pruned_loss=0.05443, over 23711.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2511, pruned_loss=0.04907, over 4717405.54 frames. ], batch size: 232, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:25:49,900 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=754693.3333333334, ans=0.125 2023-09-30 15:25:51,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:25:51,448 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:25:52,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:25:52,864 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 15:25:52,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 15:25:52,893 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 15:25:52,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 15:25:53,127 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=754693.3333333334, ans=0.125 2023-09-30 15:25:58,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:58,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:00,163 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:00,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:26:00,593 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=754693.3333333334, ans=0.125 2023-09-30 15:26:03,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:06,234 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 15:26:08,440 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:26:08,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:08,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:26:08,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:26:09,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 15:26:10,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 15:26:16,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:26:18,273 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=754760.0, ans=0.2 2023-09-30 15:26:19,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:26:27,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 15:26:27,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:28,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 15:26:32,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:35,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:26:35,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:26:37,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:26:38,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:26:38,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:41,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:43,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:43,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:44,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:46,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:47,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:26:50,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:52,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 15:26:53,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:53,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 15:26:55,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 15:26:56,889 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 15:26:56,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:57,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:26:57,091 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:58,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:58,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 15:27:01,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:27:04,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:04,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:07,081 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 15:27:10,541 INFO [train.py:1039] (1/4) Epoch 22, batch 1700, loss[loss=0.1427, simple_loss=0.2216, pruned_loss=0.03185, over 24408.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2499, pruned_loss=0.0487, over 4717528.87 frames. ], batch size: 58, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:27:12,779 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:12,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:27:14,730 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 15:27:16,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:16,179 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:27:16,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:17,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:27:19,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:27:19,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 15:27:22,321 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:27:27,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:30,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:27:30,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=755093.3333333334, ans=0.125 2023-09-30 15:27:31,444 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.78 vs. limit=15.0 2023-09-30 15:27:37,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:27:37,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:27:37,311 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:39,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:27:41,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=755093.3333333334, ans=0.125 2023-09-30 15:27:42,319 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 15:27:43,989 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:27:44,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:45,841 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.865e+02 2.081e+02 2.403e+02 3.253e+02, threshold=4.162e+02, percent-clipped=0.0 2023-09-30 15:27:46,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:27:48,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:27:49,830 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 15:27:51,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 15:27:51,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:53,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 15:27:54,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:54,981 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=755160.0, ans=0.0 2023-09-30 15:27:56,501 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=755160.0, ans=0.125 2023-09-30 15:28:03,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:05,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:06,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:28:07,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:28:07,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 15:28:07,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:28:10,797 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:10,798 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 15:28:10,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:28:10,938 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:12,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:13,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:14,736 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=755226.6666666666, ans=0.125 2023-09-30 15:28:16,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:16,170 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:28:17,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:19,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:28:19,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:22,491 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=755293.3333333334, ans=0.125 2023-09-30 15:28:23,612 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:24,571 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.53 vs. limit=22.5 2023-09-30 15:28:25,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 15:28:28,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:29,626 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:32,637 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 15:28:34,060 INFO [train.py:1039] (1/4) Epoch 22, batch 1750, loss[loss=0.1713, simple_loss=0.2518, pruned_loss=0.04536, over 24666.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2493, pruned_loss=0.04839, over 4717310.83 frames. ], batch size: 65, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:28:38,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:41,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:41,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:28:43,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 15:28:43,388 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:46,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:28:46,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:46,772 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=755360.0, ans=0.125 2023-09-30 15:28:52,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 15:28:54,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:57,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 15:28:57,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:59,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:29:02,725 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:29:02,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 15:29:06,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:29:06,087 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 15:29:13,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:29:16,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:16,831 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:20,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:20,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:22,358 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:29:24,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:28,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:28,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:29:28,705 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=755560.0, ans=0.07 2023-09-30 15:29:29,911 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 15:29:32,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:35,342 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 15:29:36,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:38,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:38,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:29:43,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:29:44,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:29:44,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:46,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:50,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:52,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:29:52,977 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=755626.6666666666, ans=0.125 2023-09-30 15:29:54,015 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:29:54,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 15:29:56,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:56,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:29:56,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:29:56,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:29:58,267 INFO [train.py:1039] (1/4) Epoch 22, batch 1800, loss[loss=0.1826, simple_loss=0.2717, pruned_loss=0.0467, over 24574.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2494, pruned_loss=0.0483, over 4725683.77 frames. ], batch size: 71, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:29:58,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:29:58,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:30:01,992 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:30:02,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:30:03,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:30:05,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:30:07,617 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=755693.3333333334, ans=0.5 2023-09-30 15:30:10,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:30:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:30:13,561 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:16,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:30:19,757 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:30:19,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 15:30:21,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:24,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:27,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 15:30:28,459 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.20 vs. limit=15.0 2023-09-30 15:30:30,403 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=755826.6666666666, ans=0.0 2023-09-30 15:30:31,545 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 15:30:31,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 15:30:33,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:33,294 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=755826.6666666666, ans=0.125 2023-09-30 15:30:34,970 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.922e+02 2.224e+02 2.607e+02 3.579e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-30 15:30:35,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:35,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:30:37,167 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:30:37,618 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=755826.6666666666, ans=0.125 2023-09-30 15:30:42,167 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 15:30:43,704 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:30:43,941 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=755826.6666666666, ans=0.0 2023-09-30 15:30:45,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:48,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 15:30:49,712 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 15:30:49,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:30:51,330 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:30:52,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:30:57,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 15:31:04,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:04,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 15:31:05,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:31:06,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:08,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:31:08,085 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 15:31:09,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:31:09,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:11,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 15:31:11,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:15,350 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:16,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:31:16,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,337 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:31:19,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:31:20,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:21,285 INFO [train.py:1039] (1/4) Epoch 22, batch 1850, loss[loss=0.1785, simple_loss=0.2625, pruned_loss=0.04723, over 23663.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.25, pruned_loss=0.04863, over 4725579.36 frames. ], batch size: 85, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:31:24,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:31:24,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:31:32,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:31:32,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 15:31:35,356 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 15:31:39,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 15:31:44,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:44,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 15:31:45,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 15:31:54,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:55,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 15:31:58,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:31:58,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:03,474 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 15:32:04,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:04,997 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:32:06,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:32:08,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:32:13,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:32:17,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:32:17,093 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:17,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:32:17,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:18,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:20,203 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:32:23,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 15:32:25,199 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:29,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:32:31,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:32:31,233 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 15:32:31,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 15:32:34,196 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 15:32:34,326 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 15:32:35,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:32:35,921 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:35,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:35,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:37,489 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 15:32:37,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:32:37,570 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:39,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:32:40,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:32:41,985 INFO [train.py:1039] (1/4) Epoch 22, batch 1900, loss[loss=0.193, simple_loss=0.2659, pruned_loss=0.06009, over 23865.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2503, pruned_loss=0.04839, over 4734785.70 frames. ], batch size: 212, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:32:42,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:32:42,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 15:32:43,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:43,800 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 15:32:43,825 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:32:45,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:52,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:54,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:32:56,153 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 15:32:57,626 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 15:32:59,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:59,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:33:01,237 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 15:33:01,280 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 15:33:04,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 15:33:07,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:33:07,587 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=756426.6666666666, ans=0.125 2023-09-30 15:33:11,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 15:33:12,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 15:33:18,267 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.802e+02 1.975e+02 2.316e+02 3.738e+02, threshold=3.949e+02, percent-clipped=0.0 2023-09-30 15:33:23,496 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 15:33:27,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 15:33:27,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:33:28,037 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 15:33:28,043 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 15:33:29,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 15:33:29,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 15:33:29,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:33:32,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 15:33:36,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:33:38,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:33:38,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 15:33:41,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:33:44,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 15:33:45,811 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:33:53,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:33:53,420 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:33:53,441 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:33:53,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:33:55,631 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:33:57,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:33:58,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:34:00,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:00,903 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:02,518 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=756626.6666666666, ans=0.0 2023-09-30 15:34:03,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:34:03,876 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:05,184 INFO [train.py:1039] (1/4) Epoch 22, batch 1950, loss[loss=0.1836, simple_loss=0.272, pruned_loss=0.04763, over 24467.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2513, pruned_loss=0.04917, over 4729725.80 frames. ], batch size: 69, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:34:05,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:34:06,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:10,101 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:13,028 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:34:13,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:13,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:34:18,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 15:34:18,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:34:18,259 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:19,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:22,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:34:22,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:22,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:25,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:34:29,100 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:29,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:34:29,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:34:31,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:34,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:38,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:38,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:38,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:34:38,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 15:34:40,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:34:40,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:34:41,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:46,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:47,870 WARNING [train.py:1197] (1/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:52,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:34:57,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:34:57,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:34:57,749 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 15:34:59,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:03,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:35:03,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:35:05,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:14,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:14,967 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:16,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=756960.0, ans=0.125 2023-09-30 15:35:17,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:19,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:19,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=756960.0, ans=0.125 2023-09-30 15:35:21,376 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756960.0, ans=0.1 2023-09-30 15:35:22,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:35:22,829 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:24,169 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 15:35:24,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:35:25,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:35:26,134 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=757026.6666666666, ans=0.07 2023-09-30 15:35:27,169 INFO [train.py:1039] (1/4) Epoch 22, batch 2000, loss[loss=0.1548, simple_loss=0.2384, pruned_loss=0.03561, over 24311.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2519, pruned_loss=0.04928, over 4723314.44 frames. ], batch size: 61, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:35:27,254 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 15:35:27,788 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=757026.6666666666, ans=0.0 2023-09-30 15:35:29,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:32,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:34,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:35:34,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:37,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:35:38,670 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:39,401 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.34 vs. limit=15.0 2023-09-30 15:35:40,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 15:35:42,352 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:35:44,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:35:46,315 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 15:35:47,368 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=757093.3333333334, ans=0.125 2023-09-30 15:35:49,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:35:49,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:52,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:35:54,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 15:35:54,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:56,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 15:35:59,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:36:00,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 15:36:00,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:04,195 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.020e+02 2.308e+02 2.637e+02 3.987e+02, threshold=4.617e+02, percent-clipped=1.0 2023-09-30 15:36:04,426 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:05,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:36:05,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:05,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:07,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:08,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 15:36:11,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 15:36:11,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:11,922 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:17,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:18,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:36:18,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:19,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:36:20,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:22,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:23,821 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:23,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:24,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:27,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:27,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 15:36:33,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:36:33,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:35,624 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=757293.3333333334, ans=0.04949747468305833 2023-09-30 15:36:38,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:38,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:36:41,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:43,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:43,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:45,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:36:45,109 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:36:49,052 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=757360.0, ans=0.125 2023-09-30 15:36:50,066 INFO [train.py:1039] (1/4) Epoch 22, batch 2050, loss[loss=0.1864, simple_loss=0.2718, pruned_loss=0.05049, over 24384.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2516, pruned_loss=0.04927, over 4713808.93 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:36:50,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:52,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:55,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:55,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:01,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:37:03,039 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:37:03,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:04,597 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:04,996 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=757426.6666666666, ans=0.125 2023-09-30 15:37:06,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 15:37:06,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:37:06,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:06,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=757426.6666666666, ans=0.125 2023-09-30 15:37:07,797 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:37:08,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=757426.6666666666, ans=0.125 2023-09-30 15:37:15,159 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.29 vs. limit=15.0 2023-09-30 15:37:17,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:17,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:19,830 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 15:37:22,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:22,636 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=757493.3333333334, ans=0.0 2023-09-30 15:37:24,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 15:37:24,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:27,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:30,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:32,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:37:32,159 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:35,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:37:36,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:37:36,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:37:39,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:41,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:37:43,094 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:37:45,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:45,673 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757560.0, ans=0.1 2023-09-30 15:37:48,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:37:51,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=757560.0, ans=0.125 2023-09-30 15:37:55,006 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:58,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 15:38:03,740 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:05,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:38:08,399 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:38:08,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 15:38:11,767 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 15:38:11,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:11,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:13,222 INFO [train.py:1039] (1/4) Epoch 22, batch 2100, loss[loss=0.1505, simple_loss=0.2359, pruned_loss=0.03255, over 24509.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2498, pruned_loss=0.04893, over 4703137.70 frames. ], batch size: 63, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:38:13,317 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:13,446 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:13,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 15:38:14,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 15:38:17,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:38:18,122 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=757693.3333333334, ans=0.0 2023-09-30 15:38:21,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:38:21,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:38:24,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:24,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:38:24,561 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 15:38:26,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:38:28,294 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 15:38:28,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 15:38:31,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:38:31,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:38:31,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 15:38:31,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 15:38:38,280 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 15:38:38,281 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:41,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:38:41,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:46,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:38:46,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 15:38:47,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:38:47,837 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:38:48,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=757826.6666666666, ans=0.125 2023-09-30 15:38:49,197 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.822e+02 2.007e+02 2.255e+02 3.053e+02, threshold=4.015e+02, percent-clipped=0.0 2023-09-30 15:38:49,402 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 15:38:49,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:49,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 15:38:50,918 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 15:38:50,980 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 15:38:52,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:38:52,902 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=757826.6666666666, ans=0.1 2023-09-30 15:38:52,950 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=757826.6666666666, ans=0.125 2023-09-30 15:38:56,266 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:38:56,914 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.19 vs. limit=15.0 2023-09-30 15:38:57,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:38:59,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:39:01,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:05,079 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,082 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 15:39:05,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:05,136 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:06,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 15:39:08,910 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 15:39:09,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 15:39:13,468 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:39:18,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:39:18,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 15:39:22,772 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:24,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:39:25,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:39:25,948 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:39:25,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:39:26,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:39:27,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:27,788 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:39:31,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:39:31,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:31,735 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=757960.0, ans=0.0 2023-09-30 15:39:32,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 15:39:34,447 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 15:39:34,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:36,500 INFO [train.py:1039] (1/4) Epoch 22, batch 2150, loss[loss=0.1577, simple_loss=0.2265, pruned_loss=0.04444, over 24306.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2479, pruned_loss=0.04832, over 4696187.71 frames. ], batch size: 56, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:39:36,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:36,695 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:39:36,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:39:38,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:39:41,546 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758026.6666666666, ans=0.1 2023-09-30 15:39:43,041 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758026.6666666666, ans=0.1 2023-09-30 15:39:44,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:39:45,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:47,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:49,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:39:49,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:50,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:39:53,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:53,714 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:39:53,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:39:56,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:58,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 15:40:01,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:05,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:40:07,174 WARNING [train.py:1197] (1/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:07,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,350 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:40:08,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:08,744 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:40:10,241 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:40:10,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 15:40:12,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:40:14,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:14,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:16,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:40:16,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:40:19,386 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:20,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:40:22,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:22,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 15:40:22,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:40:24,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:25,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:25,622 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=758226.6666666666, ans=0.125 2023-09-30 15:40:26,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:28,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:40:28,554 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:30,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:30,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 15:40:32,975 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 15:40:33,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:40:34,559 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 15:40:34,641 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:34,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:40:36,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 15:40:36,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:40:36,184 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 15:40:37,553 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 15:40:37,553 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 15:40:37,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 15:40:39,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:39,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:39,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:40:41,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:42,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:40:44,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:44,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:52,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:40:52,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 15:40:58,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:40:59,522 INFO [train.py:1039] (1/4) Epoch 22, batch 2200, loss[loss=0.1567, simple_loss=0.2332, pruned_loss=0.04013, over 24298.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2485, pruned_loss=0.04829, over 4703413.14 frames. ], batch size: 56, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:41:01,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:02,755 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:41:02,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:04,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:41:07,394 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:41:08,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:41:08,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 15:41:14,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 15:41:17,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:41:18,844 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.12 vs. limit=15.0 2023-09-30 15:41:21,982 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=758426.6666666666, ans=0.125 2023-09-30 15:41:24,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 15:41:26,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:26,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:27,712 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:41:31,007 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:41:31,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 15:41:34,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:41:36,970 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.820e+02 1.960e+02 2.258e+02 2.788e+02, threshold=3.920e+02, percent-clipped=0.0 2023-09-30 15:41:37,115 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:38,518 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 15:41:41,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:41:43,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:45,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:41:46,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:48,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 15:41:50,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:53,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 15:41:53,260 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=758560.0, ans=0.125 2023-09-30 15:41:56,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:56,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:41:56,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:59,581 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:59,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:59,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:59,716 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:42:01,127 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:42:01,197 WARNING [train.py:1197] (1/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:42:04,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:42:07,400 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:42:08,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:09,261 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758626.6666666666, ans=0.1 2023-09-30 15:42:10,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:42:12,028 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 15:42:12,205 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758626.6666666666, ans=0.1 2023-09-30 15:42:13,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:42:15,458 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 15:42:15,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:42:16,874 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 15:42:18,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:19,872 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:42:20,053 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:21,877 INFO [train.py:1039] (1/4) Epoch 22, batch 2250, loss[loss=0.1943, simple_loss=0.2621, pruned_loss=0.06329, over 23788.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2491, pruned_loss=0.04842, over 4707573.67 frames. ], batch size: 212, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:42:22,037 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 15:42:25,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:42:25,691 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=758693.3333333334, ans=0.125 2023-09-30 15:42:27,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:34,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:42:35,856 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:42:38,213 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.60 vs. limit=22.5 2023-09-30 15:42:38,352 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.46 vs. limit=10.0 2023-09-30 15:42:39,007 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:39,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:40,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:42,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 15:42:42,284 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:42:42,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:42:43,958 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 15:42:45,524 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:42:45,544 WARNING [train.py:1197] (1/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:47,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:51,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:51,884 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=758760.0, ans=0.2 2023-09-30 15:42:53,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:42:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:42:56,719 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 15:42:58,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:43:03,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:43:07,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:09,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:10,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:43:10,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:43:13,629 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:43:15,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:43:19,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:43:21,218 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:43:25,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:43:25,875 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:43:27,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:43:33,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:43:35,458 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:43:35,459 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 15:43:35,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:37,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:43:41,098 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 15:43:42,998 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=759026.6666666666, ans=0.1 2023-09-30 15:43:44,070 INFO [train.py:1039] (1/4) Epoch 22, batch 2300, loss[loss=0.1527, simple_loss=0.2385, pruned_loss=0.03346, over 24461.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2492, pruned_loss=0.04869, over 4710158.30 frames. ], batch size: 63, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:43:44,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:43:44,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:43:51,361 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.44 vs. limit=22.5 2023-09-30 15:43:53,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 15:43:55,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:43:56,351 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.99 vs. limit=15.0 2023-09-30 15:44:02,262 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:44:02,468 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=759093.3333333334, ans=0.2 2023-09-30 15:44:03,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:44:03,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:05,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:05,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 15:44:05,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:44:08,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:08,192 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:44:11,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:44:14,170 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:44:18,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:19,111 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=759160.0, ans=0.125 2023-09-30 15:44:21,554 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.881e+02 2.122e+02 2.530e+02 4.417e+02, threshold=4.245e+02, percent-clipped=2.0 2023-09-30 15:44:24,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:44:24,911 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:28,020 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:44:31,057 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:44:35,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:35,854 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:44:37,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:44:37,919 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 15:44:42,618 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:44:42,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:43,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:44,019 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:44:45,427 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:44:45,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 15:44:45,565 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:44:47,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 15:44:47,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:44:47,096 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:49,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 15:44:55,398 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:44:59,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:45:04,653 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:04,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:45:04,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:45:06,185 INFO [train.py:1039] (1/4) Epoch 22, batch 2350, loss[loss=0.1787, simple_loss=0.2585, pruned_loss=0.04947, over 23289.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2496, pruned_loss=0.04821, over 4721828.72 frames. ], batch size: 105, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:45:07,757 WARNING [train.py:1197] (1/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:45:07,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:07,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:45:09,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 15:45:11,270 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=759360.0, ans=0.0 2023-09-30 15:45:14,694 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:45:14,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 15:45:16,481 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=759360.0, ans=0.125 2023-09-30 15:45:22,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 15:45:25,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:45:29,504 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:29,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:30,354 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.92 vs. limit=22.5 2023-09-30 15:45:31,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 15:45:32,984 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=759426.6666666666, ans=0.0 2023-09-30 15:45:33,640 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.63 vs. limit=15.0 2023-09-30 15:45:34,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:45:37,779 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=759493.3333333334, ans=0.0 2023-09-30 15:45:38,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 15:45:40,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:43,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:45:43,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:45,323 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=759493.3333333334, ans=0.1 2023-09-30 15:45:46,462 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:45:46,698 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=759493.3333333334, ans=0.035 2023-09-30 15:45:46,699 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=759493.3333333334, ans=0.125 2023-09-30 15:45:48,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 15:45:48,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:45:51,664 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:51,686 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:45:51,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:54,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:45:56,951 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 15:45:58,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:46:02,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:46:03,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:46:05,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 15:46:05,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:46:05,889 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=759560.0, ans=0.125 2023-09-30 15:46:08,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 15:46:08,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:46:13,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 15:46:14,362 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.62 vs. limit=12.0 2023-09-30 15:46:17,997 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 15:46:19,882 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:46:19,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:46:19,946 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 15:46:21,333 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 15:46:22,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 15:46:26,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:46:27,430 INFO [train.py:1039] (1/4) Epoch 22, batch 2400, loss[loss=0.1655, simple_loss=0.2527, pruned_loss=0.03918, over 24039.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2501, pruned_loss=0.04845, over 4721490.72 frames. ], batch size: 80, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:46:30,827 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:46:34,949 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:46:35,165 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:46:37,164 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 15:46:37,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 15:46:44,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:46:44,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:46:46,582 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 15:46:47,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:46:49,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:49,555 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 15:46:57,399 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:59,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 15:47:02,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:47:05,698 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.822e+02 2.005e+02 2.211e+02 3.199e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 15:47:08,016 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 15:47:11,557 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:13,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:16,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:17,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 15:47:19,243 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:47:24,301 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:27,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:47:30,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:47:32,418 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:47:32,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:47:32,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:47:32,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:33,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:47:34,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:47:35,719 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=759960.0, ans=0.1 2023-09-30 15:47:38,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:47:40,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:47:40,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 15:47:42,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 15:47:44,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:44,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:46,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 15:47:46,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 15:47:47,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 15:47:47,730 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 15:47:47,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 15:47:49,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:49,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=760026.6666666666, ans=0.0 2023-09-30 15:47:50,800 INFO [train.py:1039] (1/4) Epoch 22, batch 2450, loss[loss=0.1855, simple_loss=0.2579, pruned_loss=0.05655, over 23764.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2492, pruned_loss=0.04849, over 4720993.23 frames. ], batch size: 212, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:47:50,896 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:50,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:47:52,401 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 15:47:52,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:52,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:47:56,453 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.82 vs. limit=15.0 2023-09-30 15:47:57,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:47:57,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:00,968 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:00,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:02,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 15:48:07,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:08,622 WARNING [train.py:1197] (1/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:10,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:48:10,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:48:11,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:48:12,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 15:48:17,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:20,290 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:48:20,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:48:23,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:48:23,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,231 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:48:28,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 15:48:29,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:48:30,104 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=760160.0, ans=0.125 2023-09-30 15:48:34,806 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=760160.0, ans=0.125 2023-09-30 15:48:38,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:39,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:41,157 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:41,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:48:41,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:42,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:48:43,039 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 15:48:46,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:48,306 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:48:52,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:53,375 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:58,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:48:58,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 15:48:58,277 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:48:59,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:59,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 15:49:01,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:01,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:49:06,406 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:49:06,520 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=760293.3333333334, ans=0.0 2023-09-30 15:49:10,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:10,202 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:49:12,053 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=760360.0, ans=0.125 2023-09-30 15:49:13,146 INFO [train.py:1039] (1/4) Epoch 22, batch 2500, loss[loss=0.1768, simple_loss=0.253, pruned_loss=0.05031, over 23617.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2482, pruned_loss=0.04779, over 4730624.71 frames. ], batch size: 135, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:49:13,383 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 15:49:14,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:49:21,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:31,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:49:31,767 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:49:33,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:33,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 15:49:40,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:49:42,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:49:42,668 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:49:42,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:49:43,062 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=760426.6666666666, ans=0.125 2023-09-30 15:49:44,737 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 15:49:44,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:46,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:46,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 15:49:46,382 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:47,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 15:49:47,873 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:48,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=760493.3333333334, ans=0.0 2023-09-30 15:49:50,872 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.767e+02 1.934e+02 2.176e+02 2.965e+02, threshold=3.869e+02, percent-clipped=0.0 2023-09-30 15:49:52,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:52,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:56,456 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:49:57,187 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 15:49:57,289 WARNING [train.py:1197] (1/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:49:57,496 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=760493.3333333334, ans=0.0 2023-09-30 15:49:58,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:02,793 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.47 vs. limit=22.5 2023-09-30 15:50:03,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:05,384 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=760560.0, ans=0.125 2023-09-30 15:50:08,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:09,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:14,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:50:16,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 15:50:18,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:18,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:19,806 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:50:19,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:50:21,306 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 15:50:21,307 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 15:50:21,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 15:50:24,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:26,580 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 15:50:26,808 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=760626.6666666666, ans=0.0 2023-09-30 15:50:27,941 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 15:50:28,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:30,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 15:50:30,558 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=760626.6666666666, ans=0.0 2023-09-30 15:50:33,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 15:50:33,505 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=760626.6666666666, ans=0.125 2023-09-30 15:50:36,282 INFO [train.py:1039] (1/4) Epoch 22, batch 2550, loss[loss=0.15, simple_loss=0.2292, pruned_loss=0.03543, over 24378.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2491, pruned_loss=0.04853, over 4722885.00 frames. ], batch size: 56, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:50:36,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:37,889 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:50:37,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:50:40,975 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:41,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 15:50:42,522 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:50:45,787 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 15:50:47,245 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:50:48,886 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:52,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:52,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 15:50:53,478 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.32 vs. limit=15.0 2023-09-30 15:50:53,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:50:54,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:50:54,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:55,956 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=760760.0, ans=0.125 2023-09-30 15:50:57,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:50:57,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 15:50:58,627 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:58,638 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:58,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 15:51:11,003 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:51:16,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:16,898 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:16,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:51:17,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:51:23,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:51:26,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:51:26,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:51:26,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:51:28,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:51:28,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:51:31,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:31,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:38,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:51:38,899 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 15:51:38,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:51:40,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:41,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:51:43,416 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:51:45,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:51,255 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:51:51,463 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=760960.0, ans=0.0 2023-09-30 15:51:54,257 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:57,745 INFO [train.py:1039] (1/4) Epoch 22, batch 2600, loss[loss=0.1603, simple_loss=0.2426, pruned_loss=0.03902, over 24454.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2491, pruned_loss=0.04825, over 4728273.18 frames. ], batch size: 66, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:51:57,839 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 15:51:59,463 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 15:51:59,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:51:59,542 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 15:51:59,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 15:52:01,014 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 15:52:02,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:52:02,787 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 15:52:04,354 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 15:52:05,863 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 15:52:09,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:52:12,833 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 15:52:14,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 15:52:15,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:52:15,952 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 15:52:18,911 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 15:52:18,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 15:52:26,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:26,708 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:28,077 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:28,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 15:52:28,515 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=761160.0, ans=0.05 2023-09-30 15:52:29,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:52:34,441 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.852e+02 2.092e+02 2.401e+02 4.337e+02, threshold=4.185e+02, percent-clipped=2.0 2023-09-30 15:52:37,585 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 15:52:41,070 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=761160.0, ans=0.04949747468305833 2023-09-30 15:52:43,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:45,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:45,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 15:52:45,754 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=761226.6666666666, ans=0.2 2023-09-30 15:52:47,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:52:47,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:48,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 15:52:49,717 WARNING [train.py:1197] (1/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:52:51,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:52:52,742 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:57,167 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 15:52:57,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:58,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:53:00,723 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=761293.3333333334, ans=0.125 2023-09-30 15:53:04,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:53:04,990 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:53:05,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 15:53:06,465 WARNING [train.py:1197] (1/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:53:08,665 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:10,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:16,549 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 15:53:16,912 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=761360.0, ans=0.125 2023-09-30 15:53:18,063 INFO [train.py:1039] (1/4) Epoch 22, batch 2650, loss[loss=0.1843, simple_loss=0.2616, pruned_loss=0.05349, over 23383.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2495, pruned_loss=0.04816, over 4736515.31 frames. ], batch size: 105, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:53:18,161 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:20,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:53:24,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 15:53:24,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:26,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:53:27,949 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 15:53:27,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:53:29,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:31,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:53:32,807 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:35,890 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:53:36,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 15:53:36,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:53:37,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:53:40,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 15:53:42,585 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 15:53:43,502 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.21 vs. limit=15.0 2023-09-30 15:53:44,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:48,489 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 15:53:49,881 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:53:49,985 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 15:53:50,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=761493.3333333334, ans=0.125 2023-09-30 15:53:54,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:54,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:53:54,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:54,832 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=761493.3333333334, ans=0.125 2023-09-30 15:53:56,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:00,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 15:54:00,593 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 15:54:05,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:09,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 15:54:09,660 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:09,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:11,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:11,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:11,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:12,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:14,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:14,684 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:54:16,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:54:18,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:54:19,861 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:19,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:54:21,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:22,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:22,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:54:23,886 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.69 vs. limit=15.0 2023-09-30 15:54:26,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:27,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:54:27,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:29,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 15:54:31,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:34,851 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:36,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:37,845 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:39,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:39,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:41,038 INFO [train.py:1039] (1/4) Epoch 22, batch 2700, loss[loss=0.1839, simple_loss=0.2676, pruned_loss=0.05012, over 24340.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2503, pruned_loss=0.04865, over 4729459.48 frames. ], batch size: 77, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:54:41,389 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=761693.3333333334, ans=0.2 2023-09-30 15:54:42,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:54:42,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 15:54:44,372 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:54:46,110 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=761693.3333333334, ans=0.1 2023-09-30 15:54:47,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 15:54:50,306 WARNING [train.py:1197] (1/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:50,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,390 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,539 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:54:52,466 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:52,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:54:52,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:54:52,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 15:54:52,691 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:54:54,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:55,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:54:57,211 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:01,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:55:01,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 15:55:02,596 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:08,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:55:08,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:14,907 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:55:14,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:55:14,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:55:16,483 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:55:19,397 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.905e+02 2.101e+02 2.408e+02 3.392e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-30 15:55:19,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:22,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:22,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:55:22,747 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:55:22,990 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=761826.6666666666, ans=0.0 2023-09-30 15:55:29,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:29,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:55:31,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=761893.3333333334, ans=10.0 2023-09-30 15:55:35,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:55:37,825 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:55:38,822 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=761893.3333333334, ans=0.2 2023-09-30 15:55:38,834 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=761893.3333333334, ans=0.125 2023-09-30 15:55:41,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:55:41,526 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:43,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:45,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:46,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:47,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:48,003 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-09-30 15:55:48,557 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:48,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:55:52,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:54,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:54,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:57,488 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 15:55:57,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:59,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:55:59,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 15:55:59,455 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=761960.0, ans=0.125 2023-09-30 15:55:59,640 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=761960.0, ans=0.1 2023-09-30 15:56:01,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 15:56:02,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:04,254 INFO [train.py:1039] (1/4) Epoch 22, batch 2750, loss[loss=0.1672, simple_loss=0.258, pruned_loss=0.03825, over 24330.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2495, pruned_loss=0.04856, over 4732658.69 frames. ], batch size: 74, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:56:05,891 WARNING [train.py:1197] (1/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:05,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:06,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=762026.6666666666, ans=0.125 2023-09-30 15:56:06,209 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=762026.6666666666, ans=0.1 2023-09-30 15:56:07,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:07,794 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=762026.6666666666, ans=0.125 2023-09-30 15:56:09,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:56:09,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:14,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:15,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:56:15,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:56:16,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:16,374 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 15:56:16,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:56:16,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:17,236 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=15.0 2023-09-30 15:56:22,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 15:56:23,001 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=762093.3333333334, ans=0.125 2023-09-30 15:56:24,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:56:24,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:25,596 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:56:25,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:56:27,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:28,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:56:30,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:30,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:36,294 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:56:36,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:56:36,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:56:38,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:39,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:56:47,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:50,434 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:56:50,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:55,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:55,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:56:55,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:57:00,472 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=762226.6666666666, ans=0.0 2023-09-30 15:57:01,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:57:01,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:57:01,945 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 15:57:07,724 WARNING [train.py:1197] (1/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:09,424 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 15:57:16,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:57:18,197 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:57:18,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 15:57:20,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:57:20,604 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:57:21,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 15:57:21,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:57:25,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 15:57:26,889 INFO [train.py:1039] (1/4) Epoch 22, batch 2800, loss[loss=0.1622, simple_loss=0.2498, pruned_loss=0.03726, over 24626.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2487, pruned_loss=0.04843, over 4718466.00 frames. ], batch size: 73, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:57:26,956 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:27,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:57:28,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 15:57:28,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:28,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:30,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:31,558 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 15:57:31,559 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 15:57:34,669 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:37,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:57:37,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:57:40,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:57:42,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 15:57:44,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:57:45,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 15:57:47,350 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=762426.6666666666, ans=0.0 2023-09-30 15:57:49,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:49,204 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:57:49,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:57:54,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:57:55,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:55,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:57:56,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:04,785 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.836e+02 2.011e+02 2.397e+02 3.435e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 15:58:06,460 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:58:08,003 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:09,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:11,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:58:11,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:15,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:16,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 15:58:17,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:19,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:19,024 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:58:24,168 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:26,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:29,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:31,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:58:31,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:31,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:58:33,327 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:58:33,426 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:58:35,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:58:35,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 15:58:36,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:38,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:38,482 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:38,685 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=762626.6666666666, ans=0.2 2023-09-30 15:58:39,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 15:58:40,724 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.17 vs. limit=15.0 2023-09-30 15:58:41,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:41,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:58:43,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:58:43,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 15:58:43,785 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=762626.6666666666, ans=0.0 2023-09-30 15:58:46,781 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:49,613 INFO [train.py:1039] (1/4) Epoch 22, batch 2850, loss[loss=0.1913, simple_loss=0.2595, pruned_loss=0.06152, over 23832.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2481, pruned_loss=0.04793, over 4721127.74 frames. ], batch size: 164, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:58:49,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:49,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:58:51,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:58:54,029 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:58:57,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:58:59,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:59,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:59:02,946 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:03,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:59:03,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=762693.3333333334, ans=15.0 2023-09-30 15:59:04,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:59:05,959 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 15:59:11,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 15:59:11,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:13,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 15:59:13,706 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=762760.0, ans=0.125 2023-09-30 15:59:14,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:16,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 15:59:18,042 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 15:59:21,080 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:23,149 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=762826.6666666666, ans=0.0 2023-09-30 15:59:31,871 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-09-30 15:59:32,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:32,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:34,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:59:36,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:59:36,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:59:36,512 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:59:38,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:59:39,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 15:59:41,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:59:41,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:59:41,351 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:42,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:46,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:47,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:49,229 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:52,226 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:52,905 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=15.0 2023-09-30 15:59:53,742 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:59:53,842 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:54,109 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=762960.0, ans=0.2 2023-09-30 15:59:55,333 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:58,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:00:01,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:00:03,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 16:00:05,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 16:00:06,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:00:06,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:06,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 16:00:07,034 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:00:08,447 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:08,486 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:10,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:00:10,542 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 16:00:10,612 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 16:00:10,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:10,721 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:11,975 INFO [train.py:1039] (1/4) Epoch 22, batch 2900, loss[loss=0.1848, simple_loss=0.2649, pruned_loss=0.05231, over 23375.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2486, pruned_loss=0.04779, over 4736181.24 frames. ], batch size: 93, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 16:00:15,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:15,741 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.82 vs. limit=15.0 2023-09-30 16:00:16,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:16,739 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:00:18,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 16:00:19,282 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=763026.6666666666, ans=0.0 2023-09-30 16:00:23,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:23,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 16:00:25,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 16:00:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:00:26,754 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:00:28,319 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:28,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:00:33,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:33,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:36,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:00:36,755 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=763093.3333333334, ans=0.0 2023-09-30 16:00:38,062 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 16:00:38,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:00:39,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:43,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 16:00:43,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 16:00:45,721 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.29 vs. limit=15.0 2023-09-30 16:00:46,501 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:46,505 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 16:00:46,543 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:00:50,815 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.778e+02 1.944e+02 2.291e+02 4.038e+02, threshold=3.888e+02, percent-clipped=1.0 2023-09-30 16:00:50,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:00:50,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:53,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:53,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=763160.0, ans=0.2 2023-09-30 16:00:54,748 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:58,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:01:01,672 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:03,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 16:01:03,269 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 16:01:03,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:01:05,828 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.53 vs. limit=15.0 2023-09-30 16:01:07,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:01:09,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 16:01:11,697 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:01:16,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:01:26,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:01:26,060 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:01:27,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 16:01:32,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:32,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 16:01:32,584 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:32,671 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:01:34,059 INFO [train.py:1039] (1/4) Epoch 22, batch 2950, loss[loss=0.1851, simple_loss=0.2565, pruned_loss=0.0568, over 23807.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2495, pruned_loss=0.04806, over 4747713.99 frames. ], batch size: 195, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:01:37,656 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:40,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 16:01:42,075 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:42,084 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:43,638 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:01:47,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:01:47,233 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 16:01:48,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 16:01:48,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:01:48,936 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:55,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:01:57,635 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=763426.6666666666, ans=0.125 2023-09-30 16:01:58,265 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.37 vs. limit=15.0 2023-09-30 16:01:58,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:00,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:02:01,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:04,151 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.23 vs. limit=12.0 2023-09-30 16:02:05,450 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:05,480 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:02:09,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:02:12,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 16:02:16,377 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.53 vs. limit=22.5 2023-09-30 16:02:17,104 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 16:02:17,138 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 16:02:17,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:02:18,843 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 16:02:20,649 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=763493.3333333334, ans=0.05 2023-09-30 16:02:21,751 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 16:02:21,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:21,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:02:21,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 16:02:21,920 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:02:25,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 16:02:25,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:25,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:02:28,821 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:30,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:02:30,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:30,583 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 16:02:32,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:32,148 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 16:02:40,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:42,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:02:42,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 16:02:42,184 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:02:43,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 16:02:44,528 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.20 vs. limit=10.0 2023-09-30 16:02:48,172 WARNING [train.py:1197] (1/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:02:49,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:51,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:02:51,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:51,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:02:51,662 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=763626.6666666666, ans=0.125 2023-09-30 16:02:53,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:02:55,884 INFO [train.py:1039] (1/4) Epoch 22, batch 3000, loss[loss=0.1727, simple_loss=0.2603, pruned_loss=0.04252, over 24638.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2496, pruned_loss=0.04799, over 4752025.77 frames. ], batch size: 73, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:02:55,885 INFO [train.py:1062] (1/4) Computing validation loss 2023-09-30 16:03:10,450 INFO [train.py:1071] (1/4) Epoch 22, validation: loss=0.3133, simple_loss=0.2748, pruned_loss=0.1759, over 1125622.00 frames. 2023-09-30 16:03:10,451 INFO [train.py:1072] (1/4) Maximum memory allocated so far is 21065MB 2023-09-30 16:03:10,511 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:10,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:03:10,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:03:10,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:03:12,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:03:13,577 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:13,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 16:03:14,992 WARNING [train.py:1197] (1/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:18,110 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:03:18,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:03:21,313 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 16:03:22,728 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 16:03:24,423 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:03:25,163 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.97 vs. limit=15.0 2023-09-30 16:03:25,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:03:25,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 16:03:27,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:34,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:03:34,816 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=763760.0, ans=0.2 2023-09-30 16:03:43,880 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:03:50,838 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.854e+02 2.053e+02 2.240e+02 3.574e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 16:03:52,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 16:03:52,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:03:55,738 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:03:57,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:57,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:03:59,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:03:59,996 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 16:04:00,250 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 16:04:02,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:04:03,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:04:05,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:04:05,667 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=763893.3333333334, ans=0.2 2023-09-30 16:04:07,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:07,385 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:07,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:04:10,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:04:11,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:04:11,928 WARNING [train.py:1197] (1/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:04:15,011 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:16,722 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 16:04:18,155 WARNING [train.py:1197] (1/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:04:18,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:18,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:04:23,371 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:24,758 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:24,934 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 16:04:25,000 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 16:04:25,123 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=763960.0, ans=0.1 2023-09-30 16:04:27,587 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:04:27,642 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 16:04:29,075 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:04:30,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 16:04:32,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:04:32,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:04:32,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 16:04:32,687 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 16:04:32,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:04:32,934 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=764026.6666666666, ans=0.125 2023-09-30 16:04:34,036 INFO [train.py:1039] (1/4) Epoch 22, batch 3050, loss[loss=0.1806, simple_loss=0.2675, pruned_loss=0.04685, over 24425.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2506, pruned_loss=0.04864, over 4740076.15 frames. ], batch size: 69, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:04:34,307 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:04:36,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:36,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:04:36,527 WARNING [train.py:1197] (1/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:38,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:04:40,326 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 16:04:44,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:04:46,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:04:46,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:04:49,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:53,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 16:04:56,503 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=764093.3333333334, ans=0.125 2023-09-30 16:04:59,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 16:04:59,285 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 16:05:01,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:04,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:05:07,646 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:07,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:07,732 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:11,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:12,792 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:05:12,838 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:12,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:12,909 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:16,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:18,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:21,059 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:21,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 16:05:22,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:22,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:05:25,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:05:25,935 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:05:27,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:05:27,516 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:33,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:35,431 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:40,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:40,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:05:40,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:42,594 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:44,658 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:05:44,756 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:46,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 16:05:49,049 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:49,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:49,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 16:05:51,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:51,512 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=764293.3333333334, ans=0.0 2023-09-30 16:05:56,005 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:57,363 INFO [train.py:1039] (1/4) Epoch 22, batch 3100, loss[loss=0.16, simple_loss=0.2311, pruned_loss=0.04441, over 24451.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2507, pruned_loss=0.04858, over 4731819.89 frames. ], batch size: 58, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:05:58,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:06:00,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:06:02,879 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.64 vs. limit=15.0 2023-09-30 16:06:03,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 16:06:07,431 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 16:06:07,586 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 16:06:10,812 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:06:13,931 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:06:13,949 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:17,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:06:20,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:24,550 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=764426.6666666666, ans=0.1 2023-09-30 16:06:25,939 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 16:06:31,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:06:33,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:33,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:06:34,679 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:06:34,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:06:34,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:06:35,008 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 16:06:35,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:06:36,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:38,093 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.888e+02 2.068e+02 2.233e+02 3.046e+02, threshold=4.136e+02, percent-clipped=0.0 2023-09-30 16:06:38,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 16:06:40,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:06:45,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:06:45,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 16:06:47,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 16:06:47,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:47,290 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:50,933 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:06:50,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:52,413 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:06:52,617 WARNING [train.py:1197] (1/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:06:52,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:06:55,400 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:06:55,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:06:55,473 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:55,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:07:00,705 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:07:02,147 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 16:07:03,818 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:07:05,267 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 16:07:06,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:06,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:06,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 16:07:19,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 16:07:20,443 INFO [train.py:1039] (1/4) Epoch 22, batch 3150, loss[loss=0.1728, simple_loss=0.2345, pruned_loss=0.05561, over 23706.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2492, pruned_loss=0.0482, over 4712878.11 frames. ], batch size: 232, lr: 4.66e-03, grad_scale: 8.0 2023-09-30 16:07:22,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:22,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:23,939 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:07:23,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:07:24,043 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 16:07:26,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:26,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:07:27,607 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 16:07:30,934 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:32,440 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 16:07:32,817 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=764693.3333333334, ans=0.2 2023-09-30 16:07:35,440 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 16:07:35,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:07:37,105 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 16:07:38,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 16:07:40,162 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 16:07:41,633 WARNING [train.py:1197] (1/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 16:07:41,636 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 16:07:41,675 WARNING [train.py:1197] (1/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:41,681 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:07:42,047 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=764760.0, ans=0.125 2023-09-30 16:07:43,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:44,868 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 16:07:45,247 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=764760.0, ans=0.125 2023-09-30 16:07:47,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:47,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:48,479 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:07:50,582 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:07:55,073 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 16:07:55,190 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:07:57,669 INFO [scaling.py:1022] (1/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.92 vs. limit=8.0 2023-09-30 16:07:58,157 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:08:00,104 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:08:00,186 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 16:08:03,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 16:08:05,237 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:08:05,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:08:05,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:08:06,777 WARNING [train.py:1197] (1/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:06,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:08:09,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:08:09,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:08:09,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 16:08:11,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:08:11,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:12,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:08:12,829 WARNING [train.py:1197] (1/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:08:12,940 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 16:08:14,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:17,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 16:08:17,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:17,645 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 16:08:19,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 16:08:20,698 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:08:20,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:22,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 16:08:23,157 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=764893.3333333334, ans=0.0 2023-09-30 16:08:24,261 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 16:08:24,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:26,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:08:28,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:29,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:08:34,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:08:36,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:38,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 16:08:39,870 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=764960.0, ans=0.0 2023-09-30 16:08:41,421 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=765026.6666666666, ans=0.125 2023-09-30 16:08:42,508 INFO [train.py:1039] (1/4) Epoch 22, batch 3200, loss[loss=0.1644, simple_loss=0.252, pruned_loss=0.03844, over 24647.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2487, pruned_loss=0.04795, over 4719251.37 frames. ], batch size: 73, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:08:44,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:08:44,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 16:08:48,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:48,839 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:08:50,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 16:08:53,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:57,203 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:09:00,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:09:11,418 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:09:11,864 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=765093.3333333334, ans=0.04949747468305833 2023-09-30 16:09:13,334 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765093.3333333334, ans=0.1 2023-09-30 16:09:15,479 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.86 vs. limit=15.0 2023-09-30 16:09:17,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=765160.0, ans=0.125 2023-09-30 16:09:20,687 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 16:09:22,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:09:23,567 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.875e+02 2.082e+02 2.411e+02 3.393e+02, threshold=4.163e+02, percent-clipped=0.0 2023-09-30 16:09:25,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 16:09:25,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:09:30,666 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:09:30,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:09:32,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:09:37,359 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 16:09:38,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:09:40,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 16:09:42,497 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=765226.6666666666, ans=0.125 2023-09-30 16:09:44,345 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 16:09:47,680 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:09:53,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:53,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:09:55,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:55,433 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 16:09:55,436 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:10:00,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:00,577 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=765293.3333333334, ans=0.0 2023-09-30 16:10:01,853 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 16:10:01,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 16:10:03,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 16:10:04,740 INFO [train.py:1039] (1/4) Epoch 22, batch 3250, loss[loss=0.1679, simple_loss=0.2416, pruned_loss=0.04703, over 23473.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2488, pruned_loss=0.04798, over 4720377.60 frames. ], batch size: 134, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:10:04,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 16:10:07,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:10:10,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:10:10,201 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 16:10:10,258 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:10,262 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:12,460 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 16:10:17,086 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:10:19,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:27,366 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:10:27,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 16:10:28,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:30,371 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:10:30,374 WARNING [train.py:1197] (1/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:31,904 WARNING [train.py:1197] (1/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:31,962 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:10:32,241 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=765426.6666666666, ans=0.09899494936611666 2023-09-30 16:10:35,128 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:10:35,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:35,335 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:10:40,670 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:43,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:45,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:45,102 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:46,630 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:46,706 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:46,724 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:10:52,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 16:10:52,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:52,189 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:10:54,249 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:56,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:11:02,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:11:03,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=765560.0, ans=0.125 2023-09-30 16:11:09,901 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:09,941 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:09,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 16:11:09,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:11:09,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:11:11,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:14,100 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.10 vs. limit=22.5 2023-09-30 16:11:15,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 16:11:15,071 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 16:11:15,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:11:15,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=765626.6666666666, ans=0.125 2023-09-30 16:11:16,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:16,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:18,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:11:18,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:22,959 WARNING [train.py:1197] (1/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:22,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:25,063 WARNING [train.py:1197] (1/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 16:11:25,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:28,028 INFO [train.py:1039] (1/4) Epoch 22, batch 3300, loss[loss=0.1675, simple_loss=0.2409, pruned_loss=0.04705, over 23588.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2492, pruned_loss=0.0483, over 4716751.98 frames. ], batch size: 256, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:11:28,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:11:28,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 16:11:30,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:32,466 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 16:11:34,065 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 16:11:34,747 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.99 vs. limit=10.0 2023-09-30 16:11:35,609 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 16:11:36,946 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:38,999 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765693.3333333334, ans=0.1 2023-09-30 16:11:40,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:40,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:11:41,696 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:43,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:11:43,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:11:46,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:48,454 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:53,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 16:11:53,113 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:11:53,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:56,085 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:57,487 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 16:11:59,005 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:01,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:12:01,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:12:01,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:01,200 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 16:12:06,234 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:06,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:12:07,943 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:07,947 WARNING [train.py:1197] (1/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 16:12:09,733 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.926e+02 2.080e+02 2.384e+02 3.230e+02, threshold=4.160e+02, percent-clipped=0.0 2023-09-30 16:12:09,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 16:12:09,996 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:10,321 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=765826.6666666666, ans=0.125 2023-09-30 16:12:11,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:12:13,189 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 16:12:14,713 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 16:12:14,787 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:12:16,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 16:12:19,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:23,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:12:23,242 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:12:27,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:27,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:27,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:29,204 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:12:32,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:12:32,183 WARNING [train.py:1197] (1/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:33,723 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:12:36,561 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 16:12:36,695 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 16:12:38,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:12:38,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:12:38,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:41,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:41,079 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:42,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:12:44,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:44,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:12:46,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:46,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:12:49,360 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 16:12:49,423 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:50,908 INFO [train.py:1039] (1/4) Epoch 22, batch 3350, loss[loss=0.1631, simple_loss=0.2489, pruned_loss=0.03869, over 24441.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2501, pruned_loss=0.04834, over 4729301.64 frames. ], batch size: 69, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:12:51,047 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:53,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:12:54,008 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:55,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:59,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:59,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:00,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:13:02,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:03,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:13:05,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:08,824 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:13:10,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:10,401 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:13:12,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 16:13:12,760 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=766093.3333333334, ans=0.125 2023-09-30 16:13:14,005 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 16:13:14,056 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:18,521 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 16:13:18,545 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 16:13:18,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:13:20,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:13:22,154 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:22,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 16:13:22,260 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:22,296 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:13:24,098 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=766160.0, ans=0.125 2023-09-30 16:13:25,138 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:25,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:25,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:26,895 WARNING [train.py:1197] (1/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:13:31,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:35,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:35,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:39,796 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:13:41,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:42,812 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:42,834 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:44,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:46,676 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 16:13:46,689 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:13:46,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 16:13:48,132 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:13:48,313 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 16:13:48,643 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=766226.6666666666, ans=0.0 2023-09-30 16:13:49,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:51,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:57,147 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=766293.3333333334, ans=0.0 2023-09-30 16:13:58,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:59,887 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 16:13:59,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:00,082 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:14:01,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:14:03,681 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=766293.3333333334, ans=0.125 2023-09-30 16:14:04,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:08,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 16:14:08,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:14:08,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:14:10,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:11,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 16:14:13,656 INFO [train.py:1039] (1/4) Epoch 22, batch 3400, loss[loss=0.1534, simple_loss=0.2291, pruned_loss=0.03888, over 24471.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2499, pruned_loss=0.04796, over 4731651.40 frames. ], batch size: 58, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:14:13,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:14:13,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 16:14:15,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:16,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=766360.0, ans=0.2 2023-09-30 16:14:16,500 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-09-30 16:14:17,199 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,289 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:14:18,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:14:18,823 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 16:14:19,181 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=766360.0, ans=0.1 2023-09-30 16:14:24,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 16:14:24,823 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 16:14:24,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:14:25,584 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.46 vs. limit=22.5 2023-09-30 16:14:28,571 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:28,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:30,179 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:31,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:14:37,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:14:39,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 16:14:44,335 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:14:47,362 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:47,445 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:49,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 16:14:56,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:14:57,965 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.853e+02 2.034e+02 2.228e+02 2.939e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 16:14:58,312 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 16:15:05,036 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 16:15:07,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:07,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:09,336 WARNING [train.py:1197] (1/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:15:09,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:15:11,106 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:15:16,142 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:15:16,152 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:15:16,831 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.26 vs. limit=22.5 2023-09-30 16:15:21,656 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:23,380 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 16:15:28,325 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:15:33,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 16:15:36,888 INFO [train.py:1039] (1/4) Epoch 22, batch 3450, loss[loss=0.157, simple_loss=0.2421, pruned_loss=0.03596, over 24647.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2502, pruned_loss=0.04847, over 4702862.22 frames. ], batch size: 65, lr: 4.65e-03, grad_scale: 4.0 2023-09-30 16:15:37,042 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 16:15:37,102 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:40,064 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:15:40,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 16:15:40,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:42,227 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=766693.3333333334, ans=0.125 2023-09-30 16:15:43,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:15:48,690 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=766693.3333333334, ans=0.0 2023-09-30 16:15:51,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:15:51,971 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=766760.0, ans=0.025 2023-09-30 16:15:53,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:15:54,608 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:15:54,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:56,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:03,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 16:16:07,000 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=766760.0, ans=0.05 2023-09-30 16:16:10,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 16:16:10,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:16:10,393 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:16:13,353 WARNING [train.py:1197] (1/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:18,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 16:16:19,619 WARNING [train.py:1197] (1/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:16:22,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:22,974 WARNING [train.py:1197] (1/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:16:24,438 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:16:26,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:16:27,258 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=766893.3333333334, ans=0.125 2023-09-30 16:16:28,441 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 16:16:28,457 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:16:30,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:33,663 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:16:36,703 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 16:16:41,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:16:43,929 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=15.0 2023-09-30 16:16:48,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:16:50,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:53,183 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:16:56,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:57,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:57,968 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:16:59,263 INFO [train.py:1039] (1/4) Epoch 22, batch 3500, loss[loss=0.1513, simple_loss=0.2401, pruned_loss=0.03127, over 24511.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2491, pruned_loss=0.04804, over 4710342.39 frames. ], batch size: 63, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:16:59,342 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:17:04,699 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:07,718 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:17:09,635 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 16:17:11,141 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:17:13,469 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:17:16,655 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:16,679 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 16:17:21,800 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:17:21,944 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:17:23,602 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:17:23,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:24,972 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:17:25,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:26,481 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:26,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 16:17:29,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:30,904 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:17:32,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:36,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:37,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 16:17:37,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:39,408 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:42,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:17:44,406 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.347e+02 1.810e+02 1.991e+02 2.339e+02 3.631e+02, threshold=3.981e+02, percent-clipped=0.0 2023-09-30 16:17:44,577 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:46,129 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:17:46,163 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:47,711 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 16:17:47,871 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 16:17:49,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 16:17:52,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:53,598 WARNING [train.py:1197] (1/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:53,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:53,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:17:58,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:17:58,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:18:03,348 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:04,894 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 16:18:04,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 16:18:04,906 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:08,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:10,021 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:11,640 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:12,078 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=767293.3333333334, ans=0.0 2023-09-30 16:18:13,206 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 16:18:13,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:14,840 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:18:16,392 WARNING [train.py:1197] (1/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 16:18:18,457 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 16:18:20,064 WARNING [train.py:1197] (1/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:21,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:22,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:22,055 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:24,034 INFO [train.py:1039] (1/4) Epoch 22, batch 3550, loss[loss=0.1597, simple_loss=0.2491, pruned_loss=0.03511, over 24676.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2482, pruned_loss=0.04803, over 4712907.13 frames. ], batch size: 73, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:18:25,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:18:34,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:34,993 WARNING [train.py:1197] (1/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 16:18:39,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:40,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:18:42,177 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=767426.6666666666, ans=0.0 2023-09-30 16:18:43,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:43,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:18:44,702 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:18:47,818 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:49,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:18:49,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:50,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:18:50,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:18:51,310 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=767426.6666666666, ans=0.1 2023-09-30 16:18:58,393 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:18:58,524 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=767493.3333333334, ans=0.125 2023-09-30 16:18:59,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:19:00,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:00,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:01,456 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:19:01,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 16:19:01,516 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:03,076 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:04,649 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:19:10,707 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:10,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:19:12,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:15,856 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 16:19:15,960 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:19:16,295 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=767560.0, ans=0.2 2023-09-30 16:19:17,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 16:19:17,499 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:19,095 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:19:19,131 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:19:23,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 16:19:25,216 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:30,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:32,634 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 16:19:34,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:37,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:38,931 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 16:19:40,895 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=767626.6666666666, ans=0.2 2023-09-30 16:19:44,955 INFO [train.py:1039] (1/4) Epoch 22, batch 3600, loss[loss=0.1665, simple_loss=0.2434, pruned_loss=0.04483, over 24488.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2479, pruned_loss=0.04796, over 4703570.23 frames. ], batch size: 63, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:19:46,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 16:19:46,585 WARNING [train.py:1197] (1/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:19:46,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:19:48,994 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:50,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:50,780 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=767693.3333333334, ans=0.125 2023-09-30 16:19:51,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:19:55,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:56,681 WARNING [train.py:1197] (1/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:58,112 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:19:58,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:19:58,302 WARNING [train.py:1197] (1/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:59,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 16:20:02,090 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:20:04,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:06,497 WARNING [train.py:1197] (1/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:09,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:11,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:20:11,178 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:20:11,220 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 16:20:12,732 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:15,781 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:17,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:20:18,892 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:22,009 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:24,143 WARNING [train.py:1197] (1/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:25,628 WARNING [train.py:1197] (1/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 16:20:30,136 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.855e+02 2.093e+02 2.539e+02 3.867e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 16:20:31,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:20:33,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:20:33,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 16:20:39,257 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:20:45,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:46,154 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=767893.3333333334, ans=0.035 2023-09-30 16:20:47,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:51,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:20:51,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:20:51,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 16:20:52,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 16:20:54,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 16:20:57,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:57,915 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:20:58,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 16:20:59,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:20:59,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:20:59,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:01,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 16:21:04,144 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 16:21:05,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:21:06,002 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 16:21:09,576 INFO [train.py:1039] (1/4) Epoch 22, batch 3650, loss[loss=0.1672, simple_loss=0.2425, pruned_loss=0.04594, over 23320.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2484, pruned_loss=0.04765, over 4712626.12 frames. ], batch size: 119, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:21:13,370 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 16:21:13,572 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:21:14,323 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.97 vs. limit=15.0 2023-09-30 16:21:20,018 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 16:21:21,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 16:21:26,296 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:21:26,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:21:27,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:21:31,177 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:21:31,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:32,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 16:21:34,173 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:21:34,218 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:21:35,080 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 16:21:36,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:21:36,548 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:21:36,553 WARNING [train.py:1197] (1/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:37,409 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.10 vs. limit=15.0 2023-09-30 16:21:39,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:21:43,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 16:21:44,685 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 16:21:46,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:21:47,678 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 16:21:49,857 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:21:49,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:21:53,968 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=768160.0, ans=0.125 2023-09-30 16:21:56,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:21:58,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:58,275 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:21:58,600 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=768226.6666666666, ans=0.1 2023-09-30 16:21:59,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:22:01,305 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:22:04,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:22:06,051 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:07,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:07,595 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:22:09,229 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=768226.6666666666, ans=0.2 2023-09-30 16:22:11,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:22:12,389 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:22:12,490 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:18,707 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 16:22:22,352 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:22,381 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:25,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:22:25,815 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:27,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:22:29,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:31,188 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 16:22:31,194 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:32,631 INFO [train.py:1039] (1/4) Epoch 22, batch 3700, loss[loss=0.1705, simple_loss=0.2616, pruned_loss=0.03973, over 24316.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2486, pruned_loss=0.04768, over 4709720.14 frames. ], batch size: 74, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:22:32,848 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:22:35,846 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:37,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:22:40,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:40,398 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 16:22:40,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:40,558 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:22:41,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:22:47,166 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:22:48,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:48,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:50,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:22:51,901 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:52,013 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:22:55,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:57,107 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 16:22:57,444 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=768426.6666666666, ans=0.015 2023-09-30 16:23:02,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=768426.6666666666, ans=0.125 2023-09-30 16:23:05,795 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:23:07,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:23:07,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:23:07,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 16:23:07,455 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:07,753 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=768493.3333333334, ans=0.125 2023-09-30 16:23:12,061 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:12,213 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 16:23:12,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.07 vs. limit=10.0 2023-09-30 16:23:13,647 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:15,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:23:15,457 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=768493.3333333334, ans=0.0 2023-09-30 16:23:16,431 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.838e+02 2.020e+02 2.456e+02 4.416e+02, threshold=4.040e+02, percent-clipped=2.0 2023-09-30 16:23:16,942 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=768493.3333333334, ans=0.2 2023-09-30 16:23:18,138 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:18,185 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:23:21,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:23:26,412 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:26,419 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 16:23:27,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:23:27,909 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 16:23:32,464 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:23:32,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:23:36,644 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:36,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 16:23:36,899 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=768626.6666666666, ans=0.125 2023-09-30 16:23:40,528 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:23:40,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:23:40,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:41,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:42,303 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=768626.6666666666, ans=0.125 2023-09-30 16:23:45,171 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:46,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 16:23:46,810 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 16:23:48,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:23:48,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:23:48,471 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:23:49,965 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:23:53,462 WARNING [train.py:1197] (1/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:54,768 INFO [train.py:1039] (1/4) Epoch 22, batch 3750, loss[loss=0.1641, simple_loss=0.2547, pruned_loss=0.03678, over 24632.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2502, pruned_loss=0.04856, over 4708668.96 frames. ], batch size: 68, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:23:54,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:23:55,363 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=768693.3333333334, ans=0.125 2023-09-30 16:23:56,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:23:59,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 16:23:59,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=768693.3333333334, ans=0.2 2023-09-30 16:24:00,896 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:24:02,590 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:24:04,046 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 16:24:04,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:24:05,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:05,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:09,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:14,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:16,133 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=768760.0, ans=0.125 2023-09-30 16:24:17,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:24:19,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:24:20,696 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:24:24,957 WARNING [train.py:1197] (1/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:25,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 16:24:26,578 WARNING [train.py:1197] (1/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:28,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:28,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:31,877 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 16:24:35,016 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 16:24:36,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:37,872 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:40,886 WARNING [train.py:1197] (1/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:46,643 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:48,164 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 16:24:51,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 16:24:53,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:56,268 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:56,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:25:00,942 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:25:03,307 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=768960.0, ans=0.07 2023-09-30 16:25:06,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:25:07,746 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:25:08,568 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.01 vs. limit=15.0 2023-09-30 16:25:09,357 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:25:10,910 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:25:11,115 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=768960.0, ans=0.1 2023-09-30 16:25:14,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:25:16,068 INFO [train.py:1039] (1/4) Epoch 22, batch 3800, loss[loss=0.1708, simple_loss=0.2496, pruned_loss=0.04604, over 23456.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2497, pruned_loss=0.04815, over 4725473.40 frames. ], batch size: 134, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:25:16,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=769026.6666666666, ans=0.0 2023-09-30 16:25:20,692 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=769026.6666666666, ans=0.125 2023-09-30 16:25:23,528 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:25:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:28,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:25:29,657 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 16:25:31,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:31,428 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:32,982 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:25:34,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=769093.3333333334, ans=0.125 2023-09-30 16:25:36,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 16:25:36,490 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:36,627 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:25:38,222 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:38,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:25:39,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:41,132 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 16:25:44,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 16:25:44,305 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:25:47,349 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:50,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:25:51,048 WARNING [train.py:1197] (1/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:25:53,344 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:25:53,368 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:56,410 WARNING [train.py:1197] (1/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:56,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:26:00,869 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.761e+02 1.975e+02 2.225e+02 3.481e+02, threshold=3.951e+02, percent-clipped=0.0 2023-09-30 16:26:02,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:26:02,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 16:26:02,769 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:11,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:15,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:26:17,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 16:26:17,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=769226.6666666666, ans=0.125 2023-09-30 16:26:19,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 16:26:19,330 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:22,887 WARNING [train.py:1197] (1/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:24,310 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:24,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 16:26:29,741 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 16:26:29,759 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 16:26:29,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:32,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:39,667 INFO [train.py:1039] (1/4) Epoch 22, batch 3850, loss[loss=0.1749, simple_loss=0.2567, pruned_loss=0.04652, over 23183.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2485, pruned_loss=0.04839, over 4701574.56 frames. ], batch size: 93, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:26:39,749 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:26:39,867 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:26:43,425 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=769360.0, ans=0.0 2023-09-30 16:26:43,845 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=12.0 2023-09-30 16:26:44,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:26:44,657 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 16:26:46,252 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:26:46,405 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:49,608 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:26:53,200 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:53,536 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=769360.0, ans=0.125 2023-09-30 16:26:54,815 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:26:56,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 16:26:56,668 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=769426.6666666666, ans=0.125 2023-09-30 16:27:03,972 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:04,716 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=769426.6666666666, ans=6.0 2023-09-30 16:27:07,019 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:27:09,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:09,865 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=14.24 vs. limit=15.0 2023-09-30 16:27:10,722 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:27:12,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:12,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:27:12,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:12,640 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:27:14,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,948 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:17,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:27:17,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 16:27:17,537 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 16:27:19,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:19,156 WARNING [train.py:1197] (1/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:22,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:22,277 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:23,704 WARNING [train.py:1197] (1/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 16:27:25,790 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 16:27:28,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:31,041 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 16:27:32,654 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:27:39,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:41,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:44,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:46,096 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 16:27:47,780 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 16:27:49,366 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=769626.6666666666, ans=0.0 2023-09-30 16:27:50,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:52,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:53,885 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:27:53,908 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:27:55,343 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,477 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:27:55,487 WARNING [train.py:1197] (1/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 16:27:56,924 WARNING [train.py:1197] (1/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:58,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 16:27:58,552 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:58,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:02,240 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:28:03,502 INFO [train.py:1039] (1/4) Epoch 22, batch 3900, loss[loss=0.1977, simple_loss=0.2763, pruned_loss=0.05957, over 24091.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2469, pruned_loss=0.04819, over 4694655.33 frames. ], batch size: 80, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:28:03,556 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:05,099 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:28:05,207 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:05,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:28:05,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:07,219 WARNING [train.py:1197] (1/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 16:28:07,315 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:11,833 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:13,860 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:13,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:28:15,513 WARNING [train.py:1197] (1/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:17,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:19,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:20,738 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:28:22,318 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 16:28:22,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:23,914 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 16:28:23,961 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:25,428 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 16:28:25,717 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=769760.0, ans=0.125 2023-09-30 16:28:27,002 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 16:28:30,271 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:31,710 WARNING [train.py:1197] (1/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:31,734 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:28:33,220 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:28:39,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:41,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:28:46,378 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:28:46,390 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:28:48,312 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.859e+02 2.113e+02 2.353e+02 3.355e+02, threshold=4.226e+02, percent-clipped=0.0 2023-09-30 16:28:48,419 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:28:50,059 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=769826.6666666666, ans=0.125 2023-09-30 16:28:55,193 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:55,265 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:29:02,849 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:29:03,038 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:29:10,849 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=769960.0, ans=0.125 2023-09-30 16:29:13,508 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:15,743 WARNING [train.py:1197] (1/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:17,725 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 16:29:17,792 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 16:29:17,978 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=769960.0, ans=0.0 2023-09-30 16:29:19,270 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:19,472 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 16:29:21,029 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:29:22,443 WARNING [train.py:1197] (1/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 16:29:25,827 INFO [train.py:1039] (1/4) Epoch 22, batch 3950, loss[loss=0.1822, simple_loss=0.2697, pruned_loss=0.04738, over 24312.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2473, pruned_loss=0.04785, over 4709592.60 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:29:29,689 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:29:31,150 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 16:29:31,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:29:34,245 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:29:37,210 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:29:43,251 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 16:29:43,359 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:43,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 16:29:44,782 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 16:29:44,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:48,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:48,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:29:48,112 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:50,942 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.61 vs. limit=15.0 2023-09-30 16:29:52,370 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 16:29:54,031 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:29:55,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:55,444 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:29:55,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:29:56,979 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:30:07,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:30:08,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:30:13,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 16:30:19,513 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 16:30:19,518 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 16:30:19,576 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:30:21,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:30:27,668 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.93 vs. limit=15.0 2023-09-30 16:30:29,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:30:29,964 WARNING [train.py:1197] (1/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:30:31,494 WARNING [train.py:1197] (1/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:30:31,538 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:30:31,601 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 16:30:34,377 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=770293.3333333334, ans=0.1 2023-09-30 16:30:37,752 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:30:39,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:30:41,192 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=770293.3333333334, ans=0.125 2023-09-30 16:30:41,231 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=770293.3333333334, ans=0.0 2023-09-30 16:30:43,054 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.39 vs. limit=6.0 2023-09-30 16:30:43,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 16:30:48,430 INFO [train.py:1039] (1/4) Epoch 22, batch 4000, loss[loss=0.1746, simple_loss=0.2635, pruned_loss=0.04282, over 24429.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2484, pruned_loss=0.04804, over 4725568.71 frames. ], batch size: 69, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:30:53,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:30:58,854 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:31:02,214 WARNING [train.py:1197] (1/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:05,608 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=770426.6666666666, ans=0.1 2023-09-30 16:31:06,865 WARNING [train.py:1197] (1/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:06,969 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:31:08,434 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:08,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 16:31:08,559 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:31:10,667 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 16:31:10,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:31:10,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 16:31:14,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:17,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:31:17,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:31:17,487 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:31:17,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:17,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:31:19,107 WARNING [train.py:1197] (1/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:31:20,684 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 16:31:20,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:31:21,167 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=770493.3333333334, ans=0.125 2023-09-30 16:31:22,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:23,918 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 16:31:25,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:31:25,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:32,081 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 16:31:33,551 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:34,785 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.863e+02 2.026e+02 2.291e+02 3.253e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-30 16:31:38,322 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:31:39,740 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 16:31:41,331 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:31:42,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 16:31:42,843 WARNING [train.py:1197] (1/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:31:44,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:44,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:31:46,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:31:46,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:31:48,966 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:49,175 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 16:31:49,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:50,753 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 16:31:55,546 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:32:00,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:32:01,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:32:01,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:03,152 WARNING [train.py:1197] (1/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:32:03,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:08,826 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:11,921 INFO [train.py:1039] (1/4) Epoch 22, batch 4050, loss[loss=0.1574, simple_loss=0.2474, pruned_loss=0.03367, over 24013.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2484, pruned_loss=0.04811, over 4724617.49 frames. ], batch size: 80, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:32:13,451 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:32:14,912 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 16:32:15,198 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:32:16,502 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:32:16,654 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:32:17,567 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=770693.3333333334, ans=15.0 2023-09-30 16:32:18,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:18,309 WARNING [train.py:1197] (1/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:19,269 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=770693.3333333334, ans=0.0 2023-09-30 16:32:24,221 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:28,623 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:32:30,011 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:32:31,718 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:32:31,969 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=770760.0, ans=0.0 2023-09-30 16:32:33,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:32:35,287 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=770760.0, ans=0.0 2023-09-30 16:32:36,308 WARNING [train.py:1197] (1/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:36,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:39,716 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 16:32:41,955 WARNING [train.py:1197] (1/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 16:32:43,777 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 16:32:46,711 WARNING [train.py:1197] (1/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:32:54,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 16:32:55,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:32:59,928 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=770893.3333333334, ans=0.125 2023-09-30 16:33:01,879 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:05,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:33:05,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:33:06,343 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:09,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:33:11,117 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 16:33:11,130 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:33:12,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:14,169 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 16:33:19,900 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:26,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 16:33:27,589 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:33:27,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:33:30,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 16:33:30,688 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 16:33:30,690 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:34,167 INFO [train.py:1039] (1/4) Epoch 22, batch 4100, loss[loss=0.1772, simple_loss=0.2613, pruned_loss=0.04655, over 24333.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2492, pruned_loss=0.048, over 4723529.71 frames. ], batch size: 77, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:33:34,329 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:33:36,363 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:36,387 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:33:43,986 WARNING [train.py:1197] (1/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 16:33:45,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 16:33:47,103 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 16:33:48,763 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 16:33:48,782 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:48,862 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,235 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:33:51,783 WARNING [train.py:1197] (1/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 16:33:55,588 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:33:55,737 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:33:55,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:57,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:34:00,478 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:34:01,971 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:34:02,051 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:34:03,373 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 16:34:03,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:03,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:34:03,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:03,566 WARNING [train.py:1197] (1/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:34:04,967 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 16:34:07,516 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=771160.0, ans=10.0 2023-09-30 16:34:08,707 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:08,874 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 16:34:10,336 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:34:11,925 WARNING [train.py:1197] (1/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:11,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 16:34:13,404 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:34:14,862 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:34:16,215 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:34:17,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 16:34:19,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:34:20,929 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:34:21,182 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=771226.6666666666, ans=0.025 2023-09-30 16:34:21,204 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=771226.6666666666, ans=0.035 2023-09-30 16:34:22,265 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.433e+02 1.834e+02 2.084e+02 2.360e+02 3.426e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 16:34:22,539 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 16:34:24,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:24,614 WARNING [train.py:1197] (1/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:27,632 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:33,935 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=771226.6666666666, ans=0.0 2023-09-30 16:34:35,144 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:34:38,391 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:39,888 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:34:40,751 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.67 vs. limit=10.0 2023-09-30 16:34:45,991 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=771293.3333333334, ans=0.2 2023-09-30 16:34:48,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:34:48,841 WARNING [train.py:1197] (1/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:53,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:53,533 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:34:54,881 INFO [train.py:1039] (1/4) Epoch 22, batch 4150, loss[loss=0.1751, simple_loss=0.249, pruned_loss=0.05061, over 23509.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2502, pruned_loss=0.04804, over 4735183.15 frames. ], batch size: 120, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:34:55,475 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=771360.0, ans=0.125 2023-09-30 16:34:56,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:56,828 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=771360.0, ans=0.015 2023-09-30 16:34:58,631 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:34:58,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:34:58,775 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:34:59,586 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.31 vs. limit=15.0 2023-09-30 16:35:02,358 WARNING [train.py:1197] (1/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 16:35:02,412 WARNING [train.py:1197] (1/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:03,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 16:35:03,926 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 16:35:03,950 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 16:35:06,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:07,404 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=771360.0, ans=0.0 2023-09-30 16:35:10,239 WARNING [train.py:1197] (1/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:35:10,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:15,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:17,523 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:18,976 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:35:19,212 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:35:19,232 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:20,778 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:35:20,926 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=771426.6666666666, ans=0.0 2023-09-30 16:35:25,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:25,751 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=771426.6666666666, ans=0.125 2023-09-30 16:35:28,744 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:30,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 16:35:32,324 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 16:35:32,332 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:35:34,138 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771493.3333333334, ans=0.1 2023-09-30 16:35:35,331 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 16:35:35,339 WARNING [train.py:1197] (1/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:35:35,369 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:38,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:39,942 WARNING [train.py:1197] (1/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:43,178 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 16:35:47,095 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:35:48,665 WARNING [train.py:1197] (1/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:35:49,474 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 16:35:50,760 WARNING [train.py:1197] (1/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:51,050 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=771560.0, ans=0.125 2023-09-30 16:35:52,287 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 16:35:52,989 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=15.0 2023-09-30 16:35:53,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:35:55,253 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:56,786 WARNING [train.py:1197] (1/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:58,411 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 16:35:58,411 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:58,415 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:36:00,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:36:03,044 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 16:36:04,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:04,329 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:36:04,363 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:36:04,491 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 16:36:04,523 WARNING [train.py:1197] (1/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:36:04,570 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:36:06,574 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:36:08,206 WARNING [train.py:1197] (1/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:08,246 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 16:36:09,621 WARNING [train.py:1197] (1/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:36:14,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=771626.6666666666, ans=0.1 2023-09-30 16:36:15,684 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:36:15,930 WARNING [train.py:1197] (1/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 16:36:17,315 INFO [train.py:1039] (1/4) Epoch 22, batch 4200, loss[loss=0.1631, simple_loss=0.2501, pruned_loss=0.03804, over 24464.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2486, pruned_loss=0.0477, over 4735556.49 frames. ], batch size: 69, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:36:19,503 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:36:23,092 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:23,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:36:24,780 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:24,783 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:27,785 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 16:36:30,140 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.54 vs. limit=10.0 2023-09-30 16:36:30,932 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 16:36:31,004 WARNING [train.py:1197] (1/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:33,987 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:36,928 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:36:40,501 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:36:40,763 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:36:40,804 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:42,846 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 16:36:42,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:44,439 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:45,981 WARNING [train.py:1197] (1/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:46,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:36:47,507 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:36:50,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 16:36:51,923 WARNING [train.py:1197] (1/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:52,344 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=771826.6666666666, ans=0.125 2023-09-30 16:36:57,236 WARNING [train.py:1197] (1/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:36:57,395 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:37:00,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:37:00,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:03,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:37:03,611 WARNING [train.py:1197] (1/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 16:37:03,651 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:05,225 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:37:06,544 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.822e+02 2.029e+02 2.265e+02 3.199e+02, threshold=4.057e+02, percent-clipped=0.0 2023-09-30 16:37:11,386 WARNING [train.py:1197] (1/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:37:13,541 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:20,244 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:37:21,985 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771960.0, ans=0.1 2023-09-30 16:37:22,031 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=771960.0, ans=0.125 2023-09-30 16:37:23,367 WARNING [train.py:1197] (1/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 16:37:24,983 WARNING [train.py:1197] (1/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:30,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:37:31,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:34,505 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 16:37:35,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=771960.0, ans=0.1 2023-09-30 16:37:37,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:37:40,679 INFO [train.py:1039] (1/4) Epoch 22, batch 4250, loss[loss=0.1757, simple_loss=0.2618, pruned_loss=0.04474, over 24300.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2479, pruned_loss=0.04737, over 4719496.88 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:37:41,060 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=772026.6666666666, ans=0.125 2023-09-30 16:37:42,906 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:44,316 WARNING [train.py:1197] (1/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:37:47,276 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:50,624 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:37:50,692 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 16:37:51,322 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-30 16:37:52,832 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:55,831 WARNING [train.py:1197] (1/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:58,898 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:38:02,099 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:03,891 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:06,278 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:38:06,282 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:07,866 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:09,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:10,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:12,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:38:12,880 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=772160.0, ans=0.2 2023-09-30 16:38:14,033 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:16,024 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 16:38:19,280 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=772160.0, ans=0.0 2023-09-30 16:38:20,470 WARNING [train.py:1197] (1/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 16:38:20,482 WARNING [train.py:1197] (1/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:20,599 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:38:20,637 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:22,140 WARNING [train.py:1197] (1/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:38:22,146 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:22,230 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:26,803 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 16:38:28,913 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:38:32,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:33,677 WARNING [train.py:1197] (1/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:35,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 16:38:35,146 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:38:37,328 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 16:38:38,882 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:38:41,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:38:44,208 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:44,261 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:45,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 16:38:46,789 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.41 vs. limit=10.0 2023-09-30 16:38:47,514 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:38:47,606 WARNING [train.py:1197] (1/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:38:49,510 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=772293.3333333334, ans=0.125 2023-09-30 16:38:52,314 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:55,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:55,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:38:57,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:59,151 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:01,388 WARNING [train.py:1197] (1/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:39:01,484 WARNING [train.py:1197] (1/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:01,495 WARNING [train.py:1197] (1/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 16:39:03,058 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:03,407 INFO [scaling.py:1118] (1/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.157e-03 2023-09-30 16:39:03,430 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=772360.0, ans=0.1 2023-09-30 16:39:04,424 INFO [train.py:1039] (1/4) Epoch 22, batch 4300, loss[loss=0.1822, simple_loss=0.2378, pruned_loss=0.06328, over 19418.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2472, pruned_loss=0.04726, over 4716461.99 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:39:06,418 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772360.0, ans=0.1 2023-09-30 16:39:09,120 WARNING [train.py:1197] (1/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:10,461 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:14,361 WARNING [train.py:1197] (1/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:23,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:39:23,453 WARNING [train.py:1197] (1/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 16:39:24,977 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:39:26,594 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:39:26,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:39:26,648 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 16:39:31,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:39:31,964 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=772426.6666666666, ans=0.0 2023-09-30 16:39:33,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:39:33,544 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=772426.6666666666, ans=0.0 2023-09-30 16:39:38,453 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 16:39:38,475 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:39:38,514 WARNING [train.py:1197] (1/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 16:39:41,659 WARNING [train.py:1197] (1/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:39:41,863 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:39:43,541 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:39:43,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:43,825 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772493.3333333334, ans=0.1 2023-09-30 16:39:45,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:39:48,565 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:48,729 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:50,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 16:39:50,193 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 16:39:51,858 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:39:53,170 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.888e+02 2.133e+02 2.440e+02 3.863e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 16:39:54,980 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:54,995 WARNING [train.py:1197] (1/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:39:55,022 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:55,094 WARNING [train.py:1197] (1/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:55,114 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 16:39:55,117 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 16:39:55,754 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.80 vs. limit=15.0 2023-09-30 16:39:56,671 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 16:39:58,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:39:58,205 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 16:39:58,256 WARNING [train.py:1197] (1/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 16:40:02,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:05,187 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 16:40:05,282 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:40:08,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:08,817 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:11,876 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 16:40:11,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:40:11,991 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:12,223 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=772626.6666666666, ans=0.1 2023-09-30 16:40:13,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:13,415 WARNING [train.py:1197] (1/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:13,510 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:40:16,530 WARNING [train.py:1197] (1/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:40:18,211 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:18,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:20,828 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:25,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 16:40:26,619 INFO [train.py:1039] (1/4) Epoch 22, batch 4350, loss[loss=0.1728, simple_loss=0.2593, pruned_loss=0.04316, over 24633.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2483, pruned_loss=0.04784, over 4712189.45 frames. ], batch size: 68, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:40:26,719 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:40:30,063 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=772693.3333333334, ans=0.1 2023-09-30 16:40:31,332 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:40:34,347 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:36,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:40:36,036 WARNING [train.py:1197] (1/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:40:41,840 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:40:44,875 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:46,543 WARNING [train.py:1197] (1/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:40:46,568 WARNING [train.py:1197] (1/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:49,650 WARNING [train.py:1197] (1/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:40:53,201 WARNING [train.py:1197] (1/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:40:55,429 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:41:00,321 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 16:41:01,764 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:01,890 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:07,971 WARNING [train.py:1197] (1/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:11,437 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 16:41:13,709 WARNING [train.py:1197] (1/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:15,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:41:19,800 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 16:41:21,327 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:21,409 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:41:22,914 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 16:41:24,404 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 16:41:24,425 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:24,469 WARNING [train.py:1197] (1/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:24,583 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:41:26,603 WARNING [train.py:1197] (1/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:27,912 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:27,983 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:41:31,098 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 16:41:31,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:31,125 WARNING [train.py:1197] (1/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:32,536 WARNING [train.py:1197] (1/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:32,652 WARNING [train.py:1197] (1/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 16:41:35,549 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 16:41:35,556 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 16:41:35,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 16:41:38,771 WARNING [train.py:1197] (1/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:41:38,808 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:41:38,839 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:41:39,175 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=772960.0, ans=0.125 2023-09-30 16:41:40,288 WARNING [train.py:1197] (1/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:41:41,905 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 16:41:44,858 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 16:41:44,870 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:49,165 INFO [train.py:1039] (1/4) Epoch 22, batch 4400, loss[loss=0.189, simple_loss=0.2723, pruned_loss=0.05284, over 24139.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2497, pruned_loss=0.04823, over 4714208.97 frames. ], batch size: 80, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:41:49,387 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:41:49,401 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:53,710 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:55,279 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 16:41:55,325 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 16:41:56,849 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 16:41:56,913 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 16:41:58,364 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:41:58,384 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:42:01,322 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 16:42:05,648 WARNING [train.py:1197] (1/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:07,124 WARNING [train.py:1197] (1/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:07,144 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 16:42:08,951 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:08,953 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 16:42:09,045 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 16:42:12,250 WARNING [train.py:1197] (1/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 16:42:13,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 16:42:13,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 16:42:13,837 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:15,328 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:15,403 WARNING [train.py:1197] (1/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:18,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:18,508 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 16:42:18,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 16:42:19,944 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:22,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:42:22,159 WARNING [train.py:1197] (1/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:23,776 WARNING [train.py:1197] (1/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:25,241 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:25,263 WARNING [train.py:1197] (1/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 16:42:25,412 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 16:42:27,707 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=773160.0, ans=0.125 2023-09-30 16:42:28,963 WARNING [train.py:1197] (1/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:29,265 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=773160.0, ans=0.0 2023-09-30 16:42:32,302 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=773160.0, ans=0.0 2023-09-30 16:42:36,383 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:38,240 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.760e+02 1.970e+02 2.320e+02 3.797e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 16:42:39,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 16:42:43,347 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=773226.6666666666, ans=0.0 2023-09-30 16:42:44,484 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:42:46,121 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:42:47,767 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:42:49,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 16:42:49,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:42:49,291 WARNING [train.py:1197] (1/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:42:49,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:42:50,853 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:42:51,207 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=773226.6666666666, ans=0.1 2023-09-30 16:42:54,067 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=773293.3333333334, ans=0.125 2023-09-30 16:42:55,442 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 16:42:58,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 16:43:00,989 WARNING [train.py:1197] (1/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 16:43:01,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:01,038 WARNING [train.py:1197] (1/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 16:43:02,632 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:43:08,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:43:11,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 16:43:12,014 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=773360.0, ans=0.025 2023-09-30 16:43:13,088 INFO [train.py:1039] (1/4) Epoch 22, batch 4450, loss[loss=0.1773, simple_loss=0.2482, pruned_loss=0.05323, over 23285.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2505, pruned_loss=0.04873, over 4714051.90 frames. ], batch size: 119, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:43:16,922 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:43:18,634 WARNING [train.py:1197] (1/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:20,174 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:43:25,004 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:43:25,045 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:43:28,150 WARNING [train.py:1197] (1/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:31,025 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:43:33,190 WARNING [train.py:1197] (1/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:43:33,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:36,733 WARNING [train.py:1197] (1/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 16:43:36,735 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:36,865 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:36,927 WARNING [train.py:1197] (1/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:43:36,929 WARNING [train.py:1197] (1/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:43:39,964 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:43:46,723 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.17 vs. limit=6.0 2023-09-30 16:43:47,380 WARNING [train.py:1197] (1/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:47,463 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:48,957 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:51,090 WARNING [train.py:1197] (1/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:51,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:43:55,300 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=773493.3333333334, ans=0.1 2023-09-30 16:43:56,476 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:43:58,040 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 16:43:58,066 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 16:43:58,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:44:00,970 WARNING [train.py:1197] (1/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:02,520 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 16:44:06,977 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:44:10,560 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:10,653 WARNING [train.py:1197] (1/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 16:44:10,688 WARNING [train.py:1197] (1/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:10,693 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:12,141 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:44:12,155 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:13,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:16,794 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:44:16,855 WARNING [train.py:1197] (1/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 16:44:18,389 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:44:19,990 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:44:21,500 WARNING [train.py:1197] (1/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:21,835 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=773626.6666666666, ans=0.2 2023-09-30 16:44:23,078 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:23,126 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:44:25,419 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=773626.6666666666, ans=0.0 2023-09-30 16:44:26,540 WARNING [train.py:1197] (1/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:44:30,227 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 16:44:31,935 WARNING [train.py:1197] (1/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:44:34,905 INFO [train.py:1039] (1/4) Epoch 22, batch 4500, loss[loss=0.168, simple_loss=0.2587, pruned_loss=0.03866, over 24043.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2507, pruned_loss=0.04837, over 4720860.42 frames. ], batch size: 86, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:44:36,642 WARNING [train.py:1197] (1/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:36,802 WARNING [train.py:1197] (1/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 16:44:36,804 WARNING [train.py:1197] (1/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 16:44:39,883 WARNING [train.py:1197] (1/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:44:45,287 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:46,673 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:46,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:44:48,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:44:49,683 WARNING [train.py:1197] (1/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:49,770 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:59,865 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=773760.0, ans=0.0 2023-09-30 16:45:04,449 WARNING [train.py:1197] (1/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:45:06,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:45:09,139 WARNING [train.py:1197] (1/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:09,225 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:45:10,770 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:45:15,591 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:45:15,801 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=773826.6666666666, ans=0.125 2023-09-30 16:45:21,445 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=773826.6666666666, ans=0.09899494936611666 2023-09-30 16:45:22,794 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:45:24,246 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.959e+02 2.152e+02 2.408e+02 4.470e+02, threshold=4.304e+02, percent-clipped=1.0 2023-09-30 16:45:26,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:45:29,068 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:45:29,119 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 16:45:29,562 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=773893.3333333334, ans=0.2 2023-09-30 16:45:30,566 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:30,647 WARNING [train.py:1197] (1/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:32,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:32,338 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:34,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:45:34,822 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 16:45:34,823 WARNING [train.py:1197] (1/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:45:34,835 WARNING [train.py:1197] (1/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:35,135 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=773893.3333333334, ans=0.07 2023-09-30 16:45:39,988 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:45:40,033 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:45:43,222 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:47,532 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:45:47,560 WARNING [train.py:1197] (1/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:45:49,115 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 16:45:51,531 WARNING [train.py:1197] (1/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 16:45:51,542 WARNING [train.py:1197] (1/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 16:45:55,191 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 16:45:58,094 INFO [train.py:1039] (1/4) Epoch 22, batch 4550, loss[loss=0.1638, simple_loss=0.2415, pruned_loss=0.04305, over 24457.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.25, pruned_loss=0.04854, over 4712434.23 frames. ], batch size: 63, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:45:58,266 WARNING [train.py:1197] (1/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 16:45:58,514 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=774026.6666666666, ans=0.125 2023-09-30 16:45:59,691 WARNING [train.py:1197] (1/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:02,773 WARNING [train.py:1197] (1/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:04,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:07,801 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:14,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:46:16,135 WARNING [train.py:1197] (1/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:46:19,148 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:19,151 WARNING [train.py:1197] (1/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:46:19,153 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:20,785 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:20,843 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:24,111 WARNING [train.py:1197] (1/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:46:24,731 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.37 vs. limit=15.0 2023-09-30 16:46:28,192 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 16:46:28,300 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 16:46:28,408 WARNING [train.py:1197] (1/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:46:29,917 WARNING [train.py:1197] (1/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 16:46:34,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 16:46:34,525 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:34,762 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=774160.0, ans=0.125 2023-09-30 16:46:37,579 WARNING [train.py:1197] (1/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 16:46:39,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:46:41,572 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,644 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:46:44,774 WARNING [train.py:1197] (1/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 16:46:48,313 WARNING [train.py:1197] (1/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:46:51,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:51,298 WARNING [train.py:1197] (1/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:52,784 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:54,715 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 16:46:54,816 WARNING [train.py:1197] (1/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 16:46:54,860 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:46:56,491 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 16:46:56,779 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 16:46:57,506 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.48 vs. limit=15.0 2023-09-30 16:46:58,869 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:47:00,404 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:00,430 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:01,850 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:01,880 WARNING [train.py:1197] (1/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:47:03,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:47:03,812 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=774293.3333333334, ans=0.125 2023-09-30 16:47:04,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 16:47:04,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:47:05,015 WARNING [train.py:1197] (1/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:47:06,563 WARNING [train.py:1197] (1/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 16:47:06,575 WARNING [train.py:1197] (1/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:47:06,600 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 16:47:11,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:47:11,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:47:13,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:47:14,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:14,903 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:47:16,492 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:47:18,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:47:21,075 INFO [train.py:1039] (1/4) Epoch 22, batch 4600, loss[loss=0.1741, simple_loss=0.2313, pruned_loss=0.05847, over 22792.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.248, pruned_loss=0.04842, over 4692958.59 frames. ], batch size: 322, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:47:21,145 WARNING [train.py:1197] (1/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:21,297 WARNING [train.py:1197] (1/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:24,963 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:47:24,984 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:47:25,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:27,229 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 16:47:30,074 WARNING [train.py:1197] (1/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:47:35,292 WARNING [train.py:1197] (1/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:47:36,864 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:41,304 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:43,415 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=774426.6666666666, ans=0.125 2023-09-30 16:47:44,905 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=774426.6666666666, ans=0.125 2023-09-30 16:47:48,422 WARNING [train.py:1197] (1/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 16:47:49,954 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:54,397 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:57,407 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:47:57,421 WARNING [train.py:1197] (1/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:02,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 16:48:02,396 WARNING [train.py:1197] (1/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:48:04,303 WARNING [train.py:1197] (1/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:11,416 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.811e+02 1.994e+02 2.205e+02 2.930e+02, threshold=3.988e+02, percent-clipped=0.0 2023-09-30 16:48:11,526 WARNING [train.py:1197] (1/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:11,615 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:48:11,966 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=774560.0, ans=0.125 2023-09-30 16:48:13,108 WARNING [train.py:1197] (1/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:48:16,413 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 16:48:19,295 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:48:24,098 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-09-30 16:48:24,569 WARNING [train.py:1197] (1/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:26,122 WARNING [train.py:1197] (1/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:48:29,168 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:29,180 WARNING [train.py:1197] (1/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 16:48:29,235 WARNING [train.py:1197] (1/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:29,346 WARNING [train.py:1197] (1/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 16:48:30,740 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:30,820 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:31,001 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:32,467 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:34,027 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:34,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 16:48:34,181 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 16:48:35,573 WARNING [train.py:1197] (1/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 16:48:35,584 WARNING [train.py:1197] (1/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:37,515 WARNING [train.py:1197] (1/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:37,616 WARNING [train.py:1197] (1/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:39,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:44,943 INFO [train.py:1039] (1/4) Epoch 22, batch 4650, loss[loss=0.1716, simple_loss=0.262, pruned_loss=0.0406, over 24664.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2487, pruned_loss=0.04834, over 4711880.14 frames. ], batch size: 73, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:48:49,819 WARNING [train.py:1197] (1/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:48:51,452 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:52,940 WARNING [train.py:1197] (1/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:53,010 WARNING [train.py:1197] (1/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:48:53,069 WARNING [train.py:1197] (1/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:53,105 WARNING [train.py:1197] (1/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:56,396 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:59,504 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 16:48:59,774 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=774760.0, ans=0.0 2023-09-30 16:49:00,504 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.95 vs. limit=22.5 2023-09-30 16:49:04,054 WARNING [train.py:1197] (1/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:49:04,356 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=774760.0, ans=0.0 2023-09-30 16:49:06,899 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 16:49:06,937 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:49:08,467 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 16:49:08,502 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:49:08,587 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 16:49:08,623 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 16:49:08,636 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:08,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=774760.0, ans=0.5 2023-09-30 16:49:10,030 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:49:10,676 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.13 vs. limit=10.0 2023-09-30 16:49:13,535 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:49:14,998 WARNING [train.py:1197] (1/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:15,038 WARNING [train.py:1197] (1/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 16:49:18,789 WARNING [train.py:1197] (1/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:22,067 WARNING [train.py:1197] (1/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 16:49:23,809 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:23,826 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:49:25,356 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 16:49:26,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:49:29,902 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:49:33,500 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:49:38,226 WARNING [train.py:1197] (1/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:41,376 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:42,731 WARNING [train.py:1197] (1/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:42,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:49:45,933 WARNING [train.py:1197] (1/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 16:49:46,009 WARNING [train.py:1197] (1/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 16:49:46,137 WARNING [train.py:1197] (1/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 16:49:46,139 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 16:49:48,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:49:53,526 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=774960.0, ans=0.125 2023-09-30 16:49:56,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:49:56,985 WARNING [train.py:1197] (1/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:49:58,372 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 16:49:58,409 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:00,032 WARNING [train.py:1197] (1/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:00,052 WARNING [train.py:1197] (1/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:50:01,678 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:50:04,793 WARNING [train.py:1197] (1/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:50:04,817 WARNING [train.py:1197] (1/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:06,797 INFO [train.py:1039] (1/4) Epoch 22, batch 4700, loss[loss=0.1813, simple_loss=0.2564, pruned_loss=0.05313, over 23887.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2497, pruned_loss=0.04859, over 4712017.80 frames. ], batch size: 195, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:50:06,918 WARNING [train.py:1197] (1/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:50:10,217 WARNING [train.py:1197] (1/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:10,284 WARNING [train.py:1197] (1/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:50:10,299 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:50:11,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 16:50:12,030 WARNING [train.py:1197] (1/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:50:13,540 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 16:50:16,945 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=775026.6666666666, ans=0.125 2023-09-30 16:50:18,655 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=775026.6666666666, ans=0.125 2023-09-30 16:50:21,978 WARNING [train.py:1197] (1/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:23,479 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:23,551 WARNING [train.py:1197] (1/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:50:25,047 WARNING [train.py:1197] (1/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:26,562 WARNING [train.py:1197] (1/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:50:32,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 16:50:32,100 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 16:50:35,191 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:35,711 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=775093.3333333334, ans=0.2 2023-09-30 16:50:36,762 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:50:36,805 WARNING [train.py:1197] (1/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:50:38,656 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=775160.0, ans=0.125 2023-09-30 16:50:42,023 WARNING [train.py:1197] (1/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:48,248 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:50:49,827 WARNING [train.py:1197] (1/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:50:51,355 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:54,637 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=775226.6666666666, ans=0.125 2023-09-30 16:50:55,510 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. limit=6.0 2023-09-30 16:50:55,706 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.872e+02 2.138e+02 2.585e+02 4.153e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 16:50:58,050 WARNING [train.py:1197] (1/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 16:50:58,214 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:51:01,209 WARNING [train.py:1197] (1/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:06,814 WARNING [train.py:1197] (1/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 16:51:08,414 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:51:10,145 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=775226.6666666666, ans=0.125 2023-09-30 16:51:11,617 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:51:11,946 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=775293.3333333334, ans=0.125 2023-09-30 16:51:13,116 WARNING [train.py:1197] (1/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 16:51:14,727 WARNING [train.py:1197] (1/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:14,753 WARNING [train.py:1197] (1/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:19,947 WARNING [train.py:1197] (1/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:51:20,031 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:51:20,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 16:51:21,639 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 16:51:23,251 WARNING [train.py:1197] (1/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:25,106 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=775293.3333333334, ans=0.125 2023-09-30 16:51:26,273 WARNING [train.py:1197] (1/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,274 WARNING [train.py:1197] (1/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,280 WARNING [train.py:1197] (1/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 16:51:26,432 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:29,377 INFO [train.py:1039] (1/4) Epoch 22, batch 4750, loss[loss=0.1875, simple_loss=0.2651, pruned_loss=0.05489, over 23332.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2504, pruned_loss=0.04861, over 4717206.84 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:51:31,149 WARNING [train.py:1197] (1/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 16:51:34,907 WARNING [train.py:1197] (1/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:51:36,510 WARNING [train.py:1197] (1/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,498 WARNING [train.py:1197] (1/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,535 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:51:43,013 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 16:51:43,070 WARNING [train.py:1197] (1/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:51:46,779 INFO [scaling.py:1022] (1/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.57 vs. limit=15.0 2023-09-30 16:51:47,489 WARNING [train.py:1197] (1/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 16:51:47,701 WARNING [train.py:1197] (1/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:51:47,734 WARNING [train.py:1197] (1/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:49,182 WARNING [train.py:1197] (1/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:51:56,068 WARNING [train.py:1197] (1/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 16:51:59,391 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:52:02,365 WARNING [train.py:1197] (1/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 16:52:02,477 WARNING [train.py:1197] (1/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:05,662 WARNING [train.py:1197] (1/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:05,666 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:05,698 WARNING [train.py:1197] (1/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:09,014 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 16:52:09,020 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 16:52:11,021 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=775493.3333333334, ans=0.0 2023-09-30 16:52:12,851 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 16:52:13,387 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=775493.3333333334, ans=0.0 2023-09-30 16:52:15,264 WARNING [train.py:1197] (1/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:18,198 WARNING [train.py:1197] (1/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:19,888 WARNING [train.py:1197] (1/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:52:19,889 WARNING [train.py:1197] (1/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 16:52:19,897 WARNING [train.py:1197] (1/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:23,021 WARNING [train.py:1197] (1/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:52:28,195 WARNING [train.py:1197] (1/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:52:31,228 WARNING [train.py:1197] (1/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 16:52:31,286 WARNING [train.py:1197] (1/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 16:52:31,564 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=775560.0, ans=0.05 2023-09-30 16:52:32,791 WARNING [train.py:1197] (1/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:32,842 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:52:34,417 WARNING [train.py:1197] (1/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:34,534 WARNING [train.py:1197] (1/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:52:34,567 WARNING [train.py:1197] (1/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 16:52:37,597 WARNING [train.py:1197] (1/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 16:52:40,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:52:41,120 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=775626.6666666666, ans=0.0 2023-09-30 16:52:42,544 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:52:42,547 WARNING [train.py:1197] (1/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 16:52:42,620 WARNING [train.py:1197] (1/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:44,625 WARNING [train.py:1197] (1/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:46,171 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:52:46,270 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:47,766 WARNING [train.py:1197] (1/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:52:51,506 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:52:52,852 WARNING [train.py:1197] (1/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 16:52:52,999 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 16:52:54,288 INFO [train.py:1039] (1/4) Epoch 22, batch 4800, loss[loss=0.1678, simple_loss=0.2417, pruned_loss=0.04697, over 23761.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2508, pruned_loss=0.04861, over 4722081.81 frames. ], batch size: 135, lr: 4.63e-03, grad_scale: 32.0 2023-09-30 16:52:54,532 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 16:52:57,697 WARNING [train.py:1197] (1/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:52:59,125 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:01,341 WARNING [train.py:1197] (1/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 16:53:03,144 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=775693.3333333334, ans=0.07 2023-09-30 16:53:05,986 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:06,059 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:10,884 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:53:12,463 WARNING [train.py:1197] (1/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:12,517 WARNING [train.py:1197] (1/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:12,618 WARNING [train.py:1197] (1/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 16:53:14,060 WARNING [train.py:1197] (1/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:53:14,133 WARNING [train.py:1197] (1/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:53:16,509 WARNING [train.py:1197] (1/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:53:22,808 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:25,097 WARNING [train.py:1197] (1/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:25,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:53:26,750 WARNING [train.py:1197] (1/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:26,781 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:53:26,806 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:28,420 WARNING [train.py:1197] (1/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:28,892 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=775826.6666666666, ans=0.1 2023-09-30 16:53:30,165 WARNING [train.py:1197] (1/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:31,919 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=775826.6666666666, ans=0.125 2023-09-30 16:53:33,158 WARNING [train.py:1197] (1/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:36,848 WARNING [train.py:1197] (1/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:36,878 WARNING [train.py:1197] (1/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:53:38,425 WARNING [train.py:1197] (1/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:53:41,348 WARNING [train.py:1197] (1/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:41,588 WARNING [train.py:1197] (1/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 16:53:43,023 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 16:53:43,162 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:43,200 WARNING [train.py:1197] (1/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:53:44,487 INFO [optim.py:468] (1/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.908e+02 2.098e+02 2.406e+02 3.815e+02, threshold=4.197e+02, percent-clipped=0.0 2023-09-30 16:53:44,674 WARNING [train.py:1197] (1/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:53:44,685 WARNING [train.py:1197] (1/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:44,700 WARNING [train.py:1197] (1/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:53:46,283 WARNING [train.py:1197] (1/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:53:46,379 WARNING [train.py:1197] (1/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:46,580 INFO [scaling.py:213] (1/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=775893.3333333334, ans=0.0 2023-09-30 16:53:51,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:54,920 WARNING [train.py:1197] (1/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:55,130 WARNING [train.py:1197] (1/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725